Description

The present invention relates to the field of molecular biology. In particular, it relates to, among other things, nucleotide
sequences of Staphylococcus aureus, contigs, ORFs, fragments, probes, primers and related polynucleotides
thereof, peptides and polypeptides encoded by the sequences, and uses of the polynucleotides and sequences thereof,
such as in fermentation, polypeptide production, assays and pharmaceutical development, among others.

The genus Staphylococcus includes at least 20 distinct species. (For a review see Novick, R. P., The Staphylococcus
as a Molecular Genetic System, Chapter 1, pgs. 1-37 in MOLECULAR BIOLOGY OF THE STAPHYLOCOCCI,
R. Novick, Ed., VCH Publishers, New York (1990)). Species differ from one another by 80% or more, by hybridization
kinetics, whereas strains within a species are at least 90% identical by the same measure.

The species Staphylococcus aureus, a gram-positive, facultatively aerobic, clump-forming coccus, is among the
most important etiological agents of bacterial infection in humans, as discussed briefly below. 

Human Health and S. aureus

Staphylococcus aureus is a ubiquitous pathogen. (See, for instance, Mims et al., MEDICAL MICROBIOLOGY,
Mosby-Year Book Europe Limited, London, UK (1993)). It is an etiological agent of a variety of conditions, ranging in
severity from mild to fatal. A few of the more common conditions caused by S. aureus infection are burns, cellulitis,
eyelid infections, food poisoning, joint infections, neonatal conjunctivitis, osteomyelitis, skin infections, surgical wound
infection, scalded skin syndrome and toxic shock syndrome, some of which are described further below.

Burns 

Burn wounds generally are sterile initially. However, they generally compromise physical and immune barriers to
infection, cause loss of fluid and electrolytes and result in local or general physiological dysfunction. After cooling,
contact with viable bacteria results in mixed colonization at the injury site. Infection may be restricted to the non-viable
debris on the burn surface ("eschar"), it may progress into full skin infection and invade viable tissue below the eschar,
and it may reach below the skin, enter the lymphatic and blood circulation and develop into septicaemia. S. aureus is
among the most important pathogens typically found in burn wound infections. It can destroy granulation tissue and
produce severe septicaemia.

Cellulitis 

Cellulitis, an acute infection of the skin that expands from a typically superficial origin to spread below the cutaneous
layer, most commonly is caused by S. aureus in conjunction with S. pyogenes. Cellulitis can lead to systemic infection.
In fact, cellulitis can be one aspect of synergistic bacterial gangrene. This condition typically is caused by a mixture of
S. aureus and microaerophilic streptococci. It causes necrosis and treatment is limited to excision of the necrotic tissue.
The condition often is fatal.

Eyelid infections

S. aureus is the cause of styes and of "sticky eye" in neonates, among other eye infections. Typically such infections
are limited to the surface of the eye, and may occasionally penetrate the surface with more severe consequences.

Food poisoning

Some strains of S. aureus produce one or more of five serologically distinct, heat and acid stable enterotoxins that
are not destroyed by the digestive processes of the stomach and small intestine (enterotoxins A-E). Ingestion of the toxin,
in sufficient quantities, typically results in severe vomiting, but not diarrhoea. The effect does not require viable bacteria.
Although the toxins are known, their mechanism of action is not understood.

Joint infections 

S. aureus infects bone joints causing diseases such as osteomyelitis.

Osteomyelitis 

S. aureus is the most common causative agent of haematogenous osteomyelitis. The disease tends to occur in
children and adolescents more than adults and it is associated with non-penetrating injuries to bones. Infection typically
occurs in the long end of growing bone, hence its occurrence in physically immature populations. Most often, infection
is localized in the vicinity of sprouting capillary loops adjacent to epiphysial growth plates in the end of long, growing
bones.

Skin infections 

S. aureus is the most common pathogen of such minor skin infections as abscesses and boils. Such infections
often are resolved by normal host response mechanisms, but they also can develop into severe internal infections.
Recurrent infections of the nasal passages plague nasal carriers of S. aureus.

Surgical Wound Infections 

Surgical wounds often penetrate far into the body. Infection of such wounds thus poses a grave risk to the patient.
S. aureus is the most important causative agent of infections in surgical wounds. S. aureus is unusually adept at
invading surgical wounds; sutured wounds can be infected by far fewer S. aureus cells than are necessary to cause
infection in normal skin. Invasion of surgical wounds can lead to severe S. aureus septicaemia. Invasion of the blood
stream by S. aureus can lead to seeding and infection of internal organs, particularly heart valves and bone, causing
systemic diseases, such as endocarditis and osteomyelitis.

Scalded Skin Syndrome 

S. aureus is responsible for "scalded skin syndrome" (also called toxic epidermal necrosis, Ritter's disease and
Lyell's disease). This disease occurs in older children, typically in outbreaks caused by flowering of S. aureus strains
that produce exfoliatin (also called scalded skin syndrome toxin). Although the bacteria initially may infect only a minor
lesion, the toxin destroys intercellular connections, spreads epidermal layers and allows the infection to penetrate the
outer layer of the skin, producing the desquamation that typifies the disease. Shedding of the outer layer of skin
generally reveals normal skin below, but fluid lost in the process can produce severe injury in young children if it is not
treated properly.

Toxic Shock Syndrome 

Toxic shock syndrome is caused by strains of S. aureus that produce the so-called toxic shock syndrome toxin.
The disease can be caused by S. aureus infection at any site, but it is too often erroneously viewed as a
disease solely of women who use tampons. The disease involves toxaemia and septicaemia, and can be fatal.

Nosocomial Infections

In the 1984 National Nosocomial Infection Surveillance Study ("NNIS") S. aureus was the most prevalent agent
of surgical wound infections in many hospital services, including medicine, surgery, obstetrics, pediatrics and newborns.

Resistance to drugs of S. aureus strains 

Prior to the introduction of penicillin the prognosis for patients seriously infected with S. aureus was unfavorable.
Following the introduction of penicillin in the early 1940s even the worst S. aureus infections generally could be treated
successfully. The emergence of penicillin-resistant strains of S. aureus did not take long, however. Most strains of S.
aureus encountered in hospital infections today do not respond to penicillin, although, fortunately, this is not the case
for S. aureus encountered in community infections.

It is well known now that penicillin-resistant strains of S. aureus produce a lactamase which converts penicillin to
penicilloic acid, and thereby destroys antibiotic activity. Furthermore, the lactamase gene often is propagated episomally,
typically on a plasmid, and often is only one of several genes on an episomal element that, together, confer
multidrug resistance.

Methicillins, introduced in the 1960s, largely overcame the problem of penicillin resistance in S. aureus. These
compounds conserve the portions of penicillin responsible for antibiotic activity and modify or alter other portions that
make penicillin a good substrate for inactivating lactamases. However, methicillin resistance has emerged in S. aureus,
along with resistance to many other antibiotics effective against this organism, including aminoglycosides, tetracycline,
chloramphenicol, macrolides and lincosamides. In fact, methicillin-resistant strains of S. aureus generally are multiply
drug resistant.
The molecular genetics of most types of drug resistance in S. aureus has been elucidated (See Lyon et al., Microbiology
Reviews 51: 88-134 (1987)). Generally, resistance is mediated by plasmids, as noted above regarding penicillin
resistance; however, several stable forms of drug resistance have been observed that apparently involve integration
of a resistance element into the S. aureus genome itself.

Thus far each new antibiotic gives rise to resistant strains, strains emerge that are resistant to multiple drugs
and increasingly persistent forms of resistance begin to emerge. Drug resistance of S. aureus infections already poses
significant treatment difficulties, which are likely to get much worse unless new therapeutic agents are developed.

Molecular Genetics of Staphylococcus aureus

Despite its importance in, among other things, human disease, relatively little is known about the genome of this 
organism. 

Most genetic studies of S. aureus have been carried out using the strain NCTC8325, which contains prophages
φ11, φ12 and φ13, and the UV-cured derivative of this strain, 8325-4 (also referred to as RN450), which is free of
the prophages.

These studies revealed that the S. aureus genome, like that of other staphylococci, consists of one circular, covalently
closed, double-stranded DNA and a collection of so-called variable accessory genetic elements, such as
prophages, plasmids, transposons and the like.

Physical characterization of the genome has not been carried out in any detail. Pattee et al. published a low resolution
and incomplete genetic and physical map of the chromosome of S. aureus strain NCTC 8325. (Pattee et al.,
Genetic and Physical Mapping of Chromosome of Staphylococcus aureus NCTC 8325, Chapter 11, pgs. 163-169 in
MOLECULAR BIOLOGY OF THE STAPHYLOCOCCI, R.P. Novick, Ed., VCH Publishers, New York (1990)). The genetic
map largely was produced by mapping insertions of Tn551 and Tn4001, which, respectively, confer erythromycin and
gentamicin resistance, and by analysis of SmaI-digested DNA by Pulsed Field Gel Electrophoresis ("PFGE").

The map was of low resolution; even estimating the physical size of the genome was difficult, according to the
investigators. The size of the largest SmaI chromosome fragment, for instance, was too large for accurate sizing by
PFGE. To estimate its size, additional restriction sites had to be introduced into the chromosome using a transposon
containing a SmaI recognition sequence.

In sum, most physical characteristics and almost all of the genes of Staphylococcus aureus are unknown. Among
the few genes that have been identified, most have not been physically mapped or characterized in detail. Only a very
few genes of this organism have been sequenced. (See, for instance, Thornsberry, J. Antimicrobial Chemotherapy 21
Suppl. C: 9-16 (1988), current versions of GENBANK and other nucleic acid databases, and references that relate to
the genome of S. aureus such as those set out elsewhere herein.)

It is clear that the etiology of diseases mediated or exacerbated by S. aureus infection involves the programmed
expression of S. aureus genes, and that characterizing the genes and their patterns of expression would add dramatically
to our understanding of the organism and its host interactions. Knowledge of S. aureus genes and genomic
organization would dramatically improve understanding of disease etiology and lead to improved and new ways of
preventing, ameliorating, arresting and reversing diseases. Moreover, characterized genes and genomic fragments of
S. aureus would provide reagents for, among other things, detecting, characterizing and controlling S. aureus infections.

There is a need therefore to characterize the genome of S. aureus and for polynucleotides and sequences of this
organism.

The present invention is based on the sequencing of fragments of the Staphylococcus aureus genome. The primary 
nucleotide sequences which were generated are provided in SEQ ID NOS: 1-5,191. 

The present invention provides the nucleotide sequence of several thousand contigs of the Staphylococcus aureus
genome, which are listed in tables below and set out in the Sequence Listing submitted herewith, and representative
fragments thereof, in a form which can be readily used, analyzed, and interpreted by a skilled artisan. In one embodiment,
the present invention is provided as contiguous strings of primary sequence information corresponding to the
nucleotide sequences depicted in SEQ ID NOS:1-5,191.

The present invention further provides nucleotide sequences which are at least 95%, preferably 99% and most
preferably 99.9%, identical to the nucleotide sequences of SEQ ID NOS:1-5,191.

The nucleotide sequence of SEQ ID NOS:1-5,191, a representative fragment thereof, or a nucleotide sequence
which is at least 95%, preferably 99% and most preferably 99.9%, identical to the nucleotide sequence of SEQ ID
NOS:1-5,191 may be provided in a variety of media to facilitate its use. In one application of this embodiment, the
sequences of the present invention are recorded on computer readable media. Such media include, but are not limited
to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media
such as CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these categories such as magnetic/optical
storage media.

The present invention further provides systems, particularly computer-based systems, which contain the sequence
information herein described stored in a data storage means. Such systems are designed to identify commercially
important fragments of the Staphylococcus aureus genome.

Another embodiment of the present invention is directed to fragments, preferably isolated fragments, of the Staphylococcus
aureus genome having particular structural or functional attributes. Such fragments of the Staphylococcus
aureus genome of the present invention include, but are not limited to, fragments which encode peptides, hereinafter
referred to as open reading frames or "ORFs," fragments which modulate the expression of an operably linked ORF,
hereinafter referred to as expression modulating fragments or "EMFs," and fragments which can be used to diagnose
the presence of Staphylococcus aureus in a sample, hereinafter referred to as diagnostic fragments or "DFs."

Each of the ORFs in fragments of the Staphylococcus aureus genome disclosed in Tables 1-3, and the EMFs
found 5' to the ORFs, can be used in numerous ways as polynucleotide reagents. For instance, the sequences can be
used as diagnostic probes or amplification primers for detecting or determining the presence of a specific microbe in
a sample, to selectively control gene expression in a host and in the production of polypeptides, such as polypeptides
encoded by ORFs of the present invention, particularly those polypeptides that have a pharmacological activity.

The present invention further includes recombinant constructs comprising one or more fragments of the Staphylococcus
aureus genome of the present invention. The recombinant constructs of the present invention comprise vectors,
such as a plasmid or viral vector, into which a fragment of the Staphylococcus aureus genome has been inserted.

The present invention further provides host cells containing any of the isolated fragments of the Staphylococcus
aureus genome of the present invention. The host cell can be a higher eukaryotic host cell, such as a mammalian
cell, a lower eukaryotic cell, such as a yeast cell, or a procaryotic cell such as a bacterial cell.

The present invention is further directed to polypeptides and proteins, preferably isolated polypeptides and proteins,
encoded by ORFs of the present invention. A variety of methods, well known to those of skill in the art, routinely
may be utilized to obtain any of the polypeptides and proteins of the present invention. For instance, polypeptides and
proteins of the present invention having relatively short, simple amino acid sequences readily can be synthesized using
commercially available automated peptide synthesizers. Polypeptides and proteins of the present invention also may
be purified from bacterial cells which naturally produce the protein. Yet another alternative is to purify polypeptides and
proteins of the present invention from cells which have been altered to express them.

The invention further provides polypeptides, preferably isolated polypeptides, comprising Staphylococcus aureus
epitopes and vaccine compositions comprising such polypeptides. Also provided are methods for vaccinating an individual
against Staphylococcus aureus infection.

The invention further provides methods of obtaining homologs of the fragments of the Staphylococcus aureus
genome of the present invention and homologs of the proteins encoded by the ORFs of the present invention. Specifically,
by using the nucleotide and amino acid sequences disclosed herein as a probe or as primers, and techniques
such as PCR cloning and colony/plaque hybridization, one skilled in the art can obtain homologs.

The invention further provides antibodies which selectively bind polypeptides and proteins of the present invention.
Such antibodies include both monoclonal and polyclonal antibodies.

The invention further provides hybridomas which produce the above-described antibodies. A hybridoma is an 
immortalized cell line which is capable of secreting a specific monoclonal antibody. 

The present invention further provides methods of identifying test samples derived from cells which express one
of the ORFs of the present invention, or a homolog thereof. Such methods comprise incubating a test sample with one
or more of the antibodies of the present invention, or one or more of the DFs or antigens of the present invention, under
conditions which allow a skilled artisan to determine if the sample contains the ORF or product produced therefrom.

In another embodiment of the present invention, kits are provided which contain the necessary reagents to carry 
out the above-described assays. 

Specifically, the invention provides a compartmentalized kit to receive, in close confinement, one or more containers
which comprises: (a) a first container comprising one of the antibodies, antigens, or one of the DFs of the present
invention; and (b) one or more other containers comprising one or more of the following: wash reagents, reagents
capable of detecting presence of bound antibodies, antigens or hybridized DFs.

Using the isolated proteins of the present invention, the present invention further provides methods of obtaining
and identifying agents capable of binding to a polypeptide or protein encoded by one of the ORFs of the present
invention. Specifically, such agents include, as further described below, antibodies, peptides, carbohydrates, pharmaceutical
agents and the like. Such methods comprise steps of: (a) contacting an agent with an isolated protein encoded
by one of the ORFs of the present invention; and (b) determining whether the agent binds to said protein.

The present genomic sequences of Staphylococcus aureus will be of great value to all laboratories working with
this organism and for a variety of commercial purposes. Many fragments of the Staphylococcus aureus genome will
be immediately identified by similarity searches against GenBank or protein databases and will be of immediate value
to Staphylococcus aureus researchers and of immediate commercial value for the production of proteins or to control
gene expression.

The methodology and technology for elucidating extensive genomic sequences of bacterial and other genomes
has greatly enhanced, and will continue to enhance, the ability to analyze and understand chromosomal organization. In particular, sequenced
contigs and genomes will provide the models for developing tools for the analysis of chromosome structure and function,
including the ability to identify genes within large segments of genomic DNA, the structure, position, and spacing of
regulatory elements, the identification of genes with potential industrial applications, and the ability to do comparative
genomics and molecular phylogeny.

FIGURE 1 is a block diagram of a computer system (102) that can be used to implement computer-based systems
of the present invention.

FIGURE 2 is a schematic diagram depicting the data flow and computer programs used to collect, assemble, edit
and annotate the contigs of the Staphylococcus aureus genome of the present invention. Both Macintosh and Unix
platforms are used to handle the AB 373 and 377 sequence data files, largely as described in Kerlavage et al., Proceedings
of the Twenty-Sixth Annual Hawaii International Conference on System Sciences, 535, IEEE Computer Society
Press, Washington D.C. (1993). Factura (AB) is a Macintosh program designed for automatic vector sequence
removal and end-trimming of sequence files. The program Loadis runs on a Macintosh platform and parses the feature
data extracted from the sequence files by Factura to the Unix based Staphylococcus aureus relational database. Assembly
of contigs (and whole genome sequences) is accomplished by retrieving a specific set of sequence files and
their associated features using extrseq, a Unix utility for retrieving sequences from an SQL database. The resulting
sequence file is processed by seq_filter to trim portions of the sequences with more than 2% ambiguous nucleotides.
The sequence files were assembled using TIGR Assembler, an assembly engine designed at The Institute for Genomic
Research ("TIGR") for rapid and accurate assembly of thousands of sequence fragments. The collection of contigs
generated by the assembly step is loaded into the database with the lassie program. Identification of open reading
frames (ORFs) is accomplished by processing contigs with zorf. The ORFs are searched against S. aureus sequences
from GenBank and against all protein sequences using the BLASTN and BLASTP programs, described in Altschul et
al., J. Mol. Biol. 215: 403-410 (1990). Results of the ORF determination and similarity searching steps were loaded
into the database. As described below, some results of the determination and the searches are set out in Tables 1-3.
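
By way of illustration only, the step of trimming portions of a sequence having more than 2% ambiguous nucleotides can be sketched as follows. This is a minimal Python sketch and is not the seq_filter program itself. Only the 2% threshold is taken from the description above. The window-based end-trimming rule, the default window size, and the function names ambiguous_fraction and trim_ambiguous_ends are assumptions introduced here for the example.

```python
# Hypothetical sketch: NOT the seq_filter program referred to in the text.
# Assumption: a base is "ambiguous" if it is anything other than A, C, G or T.
UNAMBIGUOUS = set("ACGT")


def ambiguous_fraction(seq: str) -> float:
    """Return the fraction of characters in seq that are not A, C, G or T."""
    if not seq:
        return 0.0
    ambiguous = sum(base.upper() not in UNAMBIGUOUS for base in seq)
    return ambiguous / len(seq)


def trim_ambiguous_ends(seq: str, max_fraction: float = 0.02, window: int = 50) -> str:
    """Trim bases from each end while the terminal window is too ambiguous.

    Assumed rule: a base is removed from an end whenever the `window` bases
    at that end contain more than `max_fraction` ambiguous characters.
    Trimming stops once fewer than `window` bases remain.
    """
    start, end = 0, len(seq)
    # Trim the 5' end.
    while end - start >= window and ambiguous_fraction(seq[start:start + window]) > max_fraction:
        start += 1
    # Trim the 3' end.
    while end - start >= window and ambiguous_fraction(seq[end - window:end]) > max_fraction:
        end -= 1
    return seq[start:end]


if __name__ == "__main__":
    read = "NNNNN" + "ACGT" * 10 + "NNN"
    print(trim_ambiguous_ends(read, window=20))
```

With a 20-base window, the example read loses its leading and trailing runs of N and retains the forty unambiguous bases in between. A different trimming rule, for instance one based on quality values, would serve equally well for the purpose described.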

The present invention is based on the sequencing of fragments of the Staphylococcus aureus genome and analysis
of the sequences. The primary nucleotide sequences generated by sequencing the fragments are provided in SEQ ID
NOS:1-5,191. (As used herein, the "primary sequence" refers to the nucleotide sequence represented by the IUPAC
nomenclature system.)

In addition to the aforementioned Staphylococcus aureus polynucleotide and polynucleotide sequences, the
present invention provides the nucleotide sequences of SEQ ID NOS:1-5,191, or representative fragments thereof, in
a form which can be readily used, analyzed, and interpreted by a skilled artisan.

As used herein, a "representative fragment of the nucleotide sequence depicted in SEQ ID NOS:1-5,191" refers
to any portion of the SEQ ID NOS:1-5,191 which is not presently represented within a publicly available database.
Preferred representative fragments of the present invention are Staphylococcus aureus open reading frames ("ORFs"),
expression modulating fragments ("EMFs") and fragments which can be used to diagnose the presence of Staphylococcus
aureus in a sample ("DFs"). A non-limiting identification of preferred representative fragments is provided in Tables
1-3.

As discussed in detail below, the information provided in SEQ ID NOS:1-5,191 and in Tables 1-3 together with
routine cloning, synthesis, sequencing and assay methods will enable those skilled in the art to clone and sequence
all "representative fragments" of interest, including open reading frames encoding a large variety of Staphylococcus
aureus proteins.

While the presently disclosed sequences of SEQ ID NOS:1-5,191 are highly accurate, sequencing techniques are
not perfect and, in relatively rare instances, further investigation of a fragment or sequence of the invention may reveal
a nucleotide sequence error present in a nucleotide sequence disclosed in SEQ ID NOS:1-5,191. However, once the
present invention is made available (i.e., once the information in SEQ ID NOS:1-5,191 and Tables 1-3 has been made
available), resolving a rare sequencing error in SEQ ID NOS:1-5,191 will be well within the skill of the art. The present
disclosure makes available sufficient sequence information to allow any of the described contigs or portions thereof to
be obtained readily by straightforward application of routine techniques. Further sequencing of such polynucleotides
may proceed in like manner using manual and automated sequencing methods which are employed ubiquitously in the
art. Nucleotide sequence editing software is publicly available. For example, Applied Biosystems' (AB) AutoAssembler
can be used as an aid during visual inspection of nucleotide sequences. By employing such routine techniques potential
errors readily may be identified and the correct sequence then may be ascertained by targeting further sequencing
effort, also of a routine nature, to the region containing the potential error.

Even if all of the very rare sequencing errors in SEQ ID NOS:1-5,191 were corrected, the resulting nucleotide
sequences would still be at least 95% identical, nearly all would be at least 99% identical, and the great majority would
be at least 99.9% identical to the nucleotide sequences of SEQ ID NOS:1-5,191.

As discussed elsewhere herein, polynucleotides of the present invention readily may be obtained by routine application
of well known and standard procedures for cloning and sequencing DNA. Detailed methods for obtaining
libraries and for sequencing are provided below, for instance. A wide variety of Staphylococcus aureus strains that can
be used to prepare S. aureus genomic DNA for cloning and for obtaining polynucleotides of the present invention are
available to the public from recognized depository institutions, such as the American Type Culture Collection ("ATCC").

The nucleotide sequences of the genomes from different strains of Staphylococcus aureus differ somewhat. However,
the nucleotide sequences of the genomes of all Staphylococcus aureus strains will be at least 95% identical, in
corresponding part, to the nucleotide sequences provided in SEQ ID NOS:1-5,191. Nearly all will be at least 99%
identical and the great majority will be 99.9% identical.

Thus, the present invention further provides nucleotide sequences which are at least 95%, preferably 99% and
most preferably 99.9% identical to the nucleotide sequences of SEQ ID NOS:1-5,191, in a form which can be readily
used, analyzed and interpreted by the skilled artisan.

Methods for determining whether a nucleotide sequence is at least 95%, at least 99% or at least 99.9% identical
to the nucleotide sequences of SEQ ID NOS:1-5,191 are routine and readily available to the skilled artisan. For example,
the well known FASTA algorithm described in Pearson and Lipman, Proc. Natl. Acad. Sci. USA 85: 2444 (1988) can be
used to generate the percent identity of nucleotide sequences. The BLASTN program also can be used to generate
an identity score of polynucleotides compared to one another.
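
For illustration, once two nucleotide sequences have been aligned, the percent identity can be computed as sketched below in Python. This sketch is neither the FASTA algorithm nor the BLASTN program, each of which also performs the alignment and scores it in its own way. The function name percent_identity, the use of "-" as the gap character, and the choice to count a column in which only one sequence has a gap as a mismatch are assumptions made for the example.

```python
# Hypothetical sketch: percent identity over an alignment that already exists.
# It does not perform the alignment and is not FASTA or BLASTN.

def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of compared alignment columns in which both sequences agree.

    Assumptions: both inputs have equal length, '-' marks a gap, columns in
    which both sequences have a gap are skipped, and a gap opposite a base
    counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # column carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns in the alignment")
    return 100.0 * matches / compared


if __name__ == "__main__":
    identity = percent_identity("ACGTACGTAC", "ACGTACGTAT")
    print(identity)            # 90.0
    print(identity >= 95.0)    # False: below the 95% threshold
```

The result can then be compared against the 95%, 99% or 99.9% thresholds recited above. Other conventions, such as excluding gap columns from the denominator, give somewhat different values, so the convention used should be stated alongside any reported figure.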

COMPUTER RELATED EMBODIMENTS 

The nucleotide sequences provided in SEQ ID NOS:1-5,191, a representative fragment thereof, or a nucleotide
sequence at least 95%, preferably at least 99% and most preferably at least 99.9% identical to a polynucleotide sequence
of SEQ ID NOS:1-5,191 may be "provided" in a variety of media to facilitate use thereof. As used herein,
"provided" refers to a manufacture, other than an isolated nucleic acid molecule, which contains a nucleotide sequence
of the present invention; i.e., a nucleotide sequence provided in SEQ ID NOS:1-5,191, a representative fragment
thereof, or a nucleotide sequence at least 95%, preferably at least 99% and most preferably at least 99.9% identical
to a polynucleotide of SEQ ID NOS:1-5,191. Such a manufacture provides a large portion of the Staphylococcus aureus
genome and parts thereof (e.g., a Staphylococcus aureus open reading frame (ORF)) in a form which allows a skilled
artisan to examine the manufacture using means not directly applicable to examining the Staphylococcus aureus genome
or a subset thereof as it exists in nature or in purified form.

In one application of this embodiment, a nucleotide sequence of the present invention can be recorded on computer
readable media. As used herein, "computer readable media" refers to any medium which can be read and accessed
directly by a computer. Such media include, but are not limited to: magnetic storage media, such as floppy discs, hard
disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as
RAM and ROM; and hybrids of these categories, such as magnetic/optical storage media. A skilled artisan can readily
appreciate how any of the presently known computer readable media can be used to create a manufacture comprising
computer readable medium having recorded thereon a nucleotide sequence of the present invention. Likewise,
it will be clear to those of skill how additional computer readable media that may be developed also can be used to
create analogous manufactures having recorded thereon a nucleotide sequence of the present invention.

As used herein, "recorded" refers to a process for storing information on computer readable medium. A skilled
artisan can readily adopt any of the presently known methods for recording information on computer readable medium
to generate manufactures comprising the nucleotide sequence information of the present invention.

A variety of data storage structures are available to a skilled artisan for creating a computer readable medium
having recorded thereon a nucleotide sequence of the present invention. The choice of the data storage structure will
generally be based on the means chosen to access the stored information. In addition, a variety of data processor
programs and formats can be used to store the nucleotide sequence information of the present invention on computer
readable medium. The sequence information can be represented in a word processing text file, formatted in commercially-available
software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored
in a database application, such as DB2, Sybase, Oracle, or the like. A skilled artisan can readily adapt any number of
data-processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having
recorded thereon the nucleotide sequence information of the present invention.
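
As one illustration of representing sequence information in the form of an ASCII file, the Python sketch below lays out sequences in the widely used FASTA text layout and reads them back. The description does not prescribe any particular file layout, so the choice of FASTA, the record names such as contig_1, and the function names format_fasta and parse_fasta are assumptions made for the example.

```python
# Hypothetical sketch: one possible ASCII representation of sequence records.
# The FASTA layout and all names below are illustrative assumptions.
from typing import Dict, List, Optional


def format_fasta(records: Dict[str, str], width: int = 60) -> str:
    """Render records as FASTA text: a '>name' line, then wrapped sequence lines."""
    lines: List[str] = []
    for name, seq in records.items():
        lines.append(f">{name}")
        for i in range(0, len(seq), width):
            lines.append(seq[i:i + width])
    return "\n".join(lines) + "\n"


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA text back into a mapping of record name to sequence."""
    parts: Dict[str, List[str]] = {}
    current: Optional[str] = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            current = line[1:]
            parts[current] = []
        elif current is not None:
            parts[current].append(line)
    return {name: "".join(chunks) for name, chunks in parts.items()}


if __name__ == "__main__":
    records = {"contig_1": "ACGT" * 40, "contig_2": "GGCC"}
    text = format_fasta(records)
    # The text could be written to any computer readable medium, for example:
    #   with open("contigs.fasta", "w", encoding="ascii") as handle:
    #       handle.write(text)
    assert parse_fasta(text) == records
    print(text)
```

The same records could equally be stored as rows of a relational database table. The flat text layout is shown only because it is the simplest to inspect.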

Computer software is publicly available which allows a skilled artisan to access sequence information provided in
a computer readable medium. Thus, by providing in computer readable form the nucleotide sequences of SEQ ID
NOS:1-5,191, a representative fragment thereof, or a nucleotide sequence at least 95%, preferably at least 99% and
most preferably at least 99.9% identical to a sequence of SEQ ID NOS:1-5,191, the present invention enables the
skilled artisan routinely to access the provided sequence information for a wide variety of purposes.

The examples which follow demonstrate how software which implements the BLAST (Altschul et al., J. Mol. Biol.
215:403-410 (1990)) and BLAZE (Brutlag et al., Comp. Chem. 17:203-207 (1993)) search algorithms on a Sybase
system was used to identify open reading frames (ORFs) within the Staphylococcus aureus genome which contain
homology to ORFs or proteins from both Staphylococcus aureus and from other organisms. Among the ORFs discussed
herein are protein encoding fragments of the Staphylococcus aureus genome useful in producing commercially important
proteins, such as enzymes used in fermentation reactions and in the production of commercially useful metabolites.

The present invention further provides systems, particularly computer-based systems, which contain the sequence
information described herein. Such systems are designed to identify, among other things, commercially important
fragments of the Staphylococcus aureus genome.

As used herein, "a computer-based system" refers to the hardware means, software means, and data storage
means used to analyze the nucleotide sequence information of the present invention. The minimum hardware means
of the computer-based systems of the present invention comprises a central processing unit (CPU), input means,
output means, and data storage means. A skilled artisan can readily appreciate that any one of the currently available
computer-based systems is suitable for use in the present invention.

As stated above, the computer-based systems of the present invention comprise a data storage means having 
stored therein a nucleotide sequence of the present invention and the necessary hardware means and software means 
for supporting and implementing a search means. 

As used herein, "data storage means" refers to memory which can store nucleotide sequence information of the
present invention, or a memory access means which can access manufactures having recorded thereon the nucleotide
sequence information of the present invention.

As used herein, "search means" refers to one or more programs which are implemented on the computer-based
system to compare a target sequence or target structural motif with the sequence information stored within the data
storage means. Search means are used to identify fragments or regions of the present genomic sequences which
match a particular target sequence or target motif. A variety of known algorithms are disclosed publicly, and a variety
of commercially available software for conducting search means is available and can be used in the computer-based
systems of the present invention. Examples of such software include, but are not limited to, MacPattern (EMBL), BLASTN and
BLASTX (NCBI). A skilled artisan can readily recognize that any one of the available algorithms or implementing
software packages for conducting homology searches can be adapted for use in the present computer-based systems.
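
The simplest possible search means can be sketched as an exact-match scan of stored sequences for a target sequence on either strand. This is an illustration only: it is not BLAST, BLAZE, or MacPattern, it finds exact matches rather than homologous regions, and the contig names and sequences in the usage note are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def _find_all(text, pattern):
    """Yield every 0-based start of pattern in text, including overlaps."""
    start = text.find(pattern)
    while start != -1:
        yield start
        start = text.find(pattern, start + 1)


def search(stored, target):
    """Return (contig_id, strand, start) for each exact match of target.

    stored maps contig identifiers to sequences. start is 0-based and is
    always given in forward-strand coordinates of the stored sequence.
    """
    target = target.upper()
    rc_target = reverse_complement(target)
    hits = []
    for contig_id, seq in stored.items():
        seq = seq.upper()
        for pos in _find_all(seq, target):
            hits.append((contig_id, "+", pos))
        # A palindromic target would otherwise be reported twice.
        if rc_target != target:
            for pos in _find_all(seq, rc_target):
                hits.append((contig_id, "-", pos))
    return hits
```

For example, with `stored = {"c1": "AAGGATCCTT", "c2": "TTTACGTTT"}`, searching for `"AACG"` reports a single minus-strand hit in `c2`, because only the reverse complement `"CGTT"` occurs in the stored sequences.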

As used herein, a "target sequence" can be any DNA or amino acid sequence of six or more nucleotides or two
or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely it is
to be present as a random occurrence in the database. The most preferred sequence length of a target
sequence is from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues. However, it is well recognized
that target sequences used in searches for commercially important fragments, such as sequence fragments involved in
gene expression and protein processing, may be of shorter length.
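
The relationship between target length and chance occurrence can be made concrete with a rough estimate. The sketch below assumes, purely for illustration, that residues are independent and equally frequent, which real genomes only approximate; the function name and the database size used in the usage note are hypothetical.

```python
def expected_random_hits(target_length, database_length, alphabet_size=4):
    """Expected number of chance exact matches on one strand.

    Assumes independent, equally frequent residues: each of the
    (database_length - target_length + 1) start positions matches with
    probability (1 / alphabet_size) ** target_length.
    """
    positions = max(database_length - target_length + 1, 0)
    return positions * (1.0 / alphabet_size) ** target_length
```

Under these assumptions, a six-nucleotide target is expected to occur by chance hundreds of times in a database of 2,800,000 bases, whereas a 30-nucleotide target is expected to occur far less than once. Passing `alphabet_size=20` gives the corresponding estimate for amino acid targets.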

As used herein, "a target structural motif," or "target motif," refers to any rationally selected sequence or combination
of sequences in which the sequence(s) are chosen based on a three-dimensional configuration which is formed
upon the folding of the target motif. There are a variety of target motifs known in the art. Protein target motifs include,
but are not limited to, enzymic active sites and signal sequences. Nucleic acid target motifs include, but are not limited
to, promoter sequences, hairpin structures and inducible expression elements (protein binding sequences).
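
As one hedged illustration of a structural target motif, a candidate hairpin can be recognized in sequence data as an inverted repeat: a stem, a short loop, and the reverse complement of the stem. The stem and loop sizes below are arbitrary illustrative defaults, not values taken from the present disclosure, and the sketch requires perfect stem pairing.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_hairpins(seq, stem=6, min_loop=3, max_loop=8):
    """Return (start, stem_sequence, loop_length) for perfect inverted repeats.

    start is the 0-based position of the first stem base. A hit means the
    stem is followed, after loop_length bases, by its reverse complement.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * stem - min_loop + 1):
        left = seq[i:i + stem]
        right_wanted = reverse_complement(left)
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop
            if j + stem > len(seq):
                break
            if seq[j:j + stem] == right_wanted:
                hits.append((i, left, loop))
    return hits
```

A fuller treatment would tolerate mismatches in the stem and score thermodynamic stability; this sketch only shows how a structural criterion can be reduced to a sequence search.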

A variety of structural formats for the input and output means can be used to input and output the information in 
the computer-based systems of the present invention. A preferred format for an output means ranks fragments of the 
Staphylococcus aureus genomic sequences possessing varying degrees of homology to the target sequence or target 
motif. Such presentation provides a skilled artisan with a ranking of sequences which contain various amounts of the
target sequence or target motif and identifies the degree of homology contained in the identified fragment.
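
A ranked output of this kind might be produced as in the following sketch, which scores every ungapped window of a stored sequence against the target and sorts the windows by percent identity. The scoring scheme is a deliberate simplification (no gaps, no substitution matrix) and the function names are hypothetical.

```python
def percent_identity(a, b):
    """Percent of aligned positions that are identical (equal lengths assumed)."""
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def rank_fragments(genome, target, top_n=5):
    """Return the top_n windows as (percent_identity, start, fragment).

    start is 1-based. Windows are ranked by decreasing identity, with
    ties broken by position.
    """
    if not target:
        raise ValueError("target must not be empty")
    genome, target = genome.upper(), target.upper()
    n = len(target)
    scored = [
        (percent_identity(genome[i:i + n], target), i + 1, genome[i:i + n])
        for i in range(len(genome) - n + 1)
    ]
    scored.sort(key=lambda hit: (-hit[0], hit[1]))
    return scored[:top_n]
```

Each returned tuple carries both the rank order and the degree of identity, which are the two pieces of information the preferred output format is described as presenting.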

A variety of comparing means can be used to compare a target sequence or target motif with the data storage
means to identify sequence fragments of the Staphylococcus aureus genome. In the present examples, software
which implements the BLAST and BLAZE algorithms, described in Altschul et al., J. Mol. Biol. 215: 403-410
(1990), was used to identify open reading frames within the Staphylococcus aureus genome. A skilled artisan can
readily recognize that any one of the publicly available homology search programs can be used as the search means
for the computer-based systems of the present invention. Of course, suitable proprietary systems that may be known
to those of skill also may be employed in this regard.

Figure 1 provides a block diagram of a computer system illustrative of embodiments of this aspect of the present
invention. The computer system 102 includes a processor 106 connected to a bus 104. Also connected to the bus 104
are a main memory 108 (preferably implemented as random access memory, RAM) and a variety of secondary storage
devices 110, such as a hard drive 112 and a removable medium storage device 114. The removable medium storage
device 114 may represent, for example, a floppy disk drive, a CD-ROM drive, a magnetic tape drive, etc. A removable
storage medium 116 (such as a floppy disk, a compact disk, a magnetic tape, etc.) containing control logic and/or data
recorded therein may be inserted into the removable medium storage device 114. The computer system 102 includes
appropriate software for reading the control logic and/or the data from the removable medium storage device 114, once
the removable storage medium 116 is inserted into the removable medium storage device 114.

A nucleotide sequence of the present invention may be stored in a well known manner in the main memory 108,
any of the secondary storage devices 110, and/or a removable storage medium 116. During execution, software for
accessing and processing the genomic sequence (such as search tools, comparing tools, etc.) resides in main memory
108, in accordance with the requirements and operating parameters of the operating system, the hardware system 
and the software program or programs. 

BIOCHEMICAL EMBODIMENTS 

Other embodiments of the present invention are directed to fragments of the Staphylococcus aureus genome, 
preferably to isolated fragments. The fragments of the Staphylococcus aureus genome of the present invention include, 
but are not limited to fragments which encode peptides, hereinafter open reading frames (ORFs), fragments which 
modulate the expression of an operably linked ORF, hereinafter expression modulating fragments (EMFs) and frag- 
ments which can be used to diagnose the presence of Staphylococcus aureus in a sample, hereinafter diagnostic 
fragments (DFs). 

As used herein, an "isolated nucleic acid molecule" or an "isolated fragment of the Staphylococcus aureus genome" 
refers to a nucleic acid molecule possessing a specific nucleotide sequence which has been subjected to purification 
means to reduce, from the composition, the number of compounds which are normally associated with the composition. 
Particularly, the term refers to the nucleic acid molecules having the sequences set out in SEQ ID NOS:1-5,191, to
representative fragments thereof as described above, to polynucleotides at least 95%, preferably at least 99% and 
especially preferably at least 99.9% identical in sequence thereto, also as set out above. 

A variety of purification means can be used to generate the isolated fragments of the present invention. These
include, but are not limited to, methods which separate constituents of a solution based on charge, solubility, or size.

In one embodiment, Staphylococcus aureus DNA can be mechanically sheared to produce fragments of 15-20 kb
in length. These fragments can then be used to generate a Staphylococcus aureus library by inserting them into
lambda clones as described in the Examples below. Primers flanking, for example, an ORF, such as those enumerated
in Tables 1-3, can then be generated using nucleotide sequence information provided in SEQ ID NOS:1-5,191. Well
known and routine techniques of PCR cloning then can be used to isolate the ORF from the lambda DNA library of
Staphylococcus aureus genomic DNA. Thus, given the availability of SEQ ID NOS:1-5,191, the information in Tables
1, 2 and 3, and the information that may be obtained readily by analysis of the sequences of SEQ ID NOS:1-5,191 
using methods set out above, those of skill will be enabled by the present disclosure to isolate any ORF-containing or 
other nucleic acid fragment of the present invention. 
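
The step of generating primers that flank an ORF from known contig sequence can be sketched as follows. The sketch assumes a forward-strand ORF described by a 1-based start and a length in nucleotides, as in the tables; the primer length of 20 and the function names are illustrative assumptions, and no melting-temperature or specificity checks are attempted.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def flanking_primers(contig, orf_start, orf_length, primer_length=20):
    """Return (forward, reverse) primers immediately flanking an ORF.

    orf_start is 1-based. The forward primer is the sequence just
    upstream of the ORF; the reverse primer is the reverse complement
    of the sequence just downstream, so both read 5' to 3'.
    """
    contig = contig.upper()
    begin = orf_start - 1          # 0-based index of first ORF base
    end = begin + orf_length       # 0-based index just past the ORF
    if begin - primer_length < 0 or end + primer_length > len(contig):
        raise ValueError("not enough flanking sequence for primers")
    forward = contig[begin - primer_length:begin]
    reverse = reverse_complement(contig[end:end + primer_length])
    return forward, reverse
```

For instance, for the hypothetical contig `"TTTTTACGTAATGAAATAAGGCCATTTTT"` with an ORF starting at position 11 and 9 nucleotides long, five-base primers would be `"ACGTA"` (forward) and `"TGGCC"` (reverse).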

The isolated nucleic acid molecules of the present invention include, but are not limited to single stranded and 
double stranded DNA, and single stranded RNA. 

As used herein, an "open reading frame," ORF, means a series of triplets coding for amino acids without any 
termination codons and is a sequence translatable into protein. 

Tables 1, 2 and 3 list ORFs in the Staphylococcus aureus genomic contigs of the present invention that were
identified as putative coding regions by the GeneMark software using organism-specific second-order Markov proba- 
bility transition matrices. It will be appreciated that other criteria can be used, in accordance with well known analytical 
methods, such as those discussed herein, to generate more inclusive, more restrictive or more selective lists. 

Table 1 sets out ORFs in the Staphylococcus aureus contigs of the present invention that are at least 50 amino
acids long and that, over a continuous region of at least 50 bases, are 95% or more identical (by BLAST analysis) to
an S. aureus nucleotide sequence available through Genbank in November 1996.

Table 2 sets out ORFs in the Staphylococcus aureus contigs of the present invention that are not in Table 1 and 
match, with a BLASTP probability score of 0.01 or less, a polypeptide sequence available through Genbank by Sep- 
tember 1996. 

Table 3 sets out ORFs in the Staphylococcus aureus contigs of the present invention that do not match significantly, 
by BLASTP analysis, a polypeptide sequence available through Genbank by September 1996. 
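
The way the three tables partition the ORFs can be summarized as a small decision function. This is a sketch of the stated thresholds only: the parameter names are hypothetical, the inputs are assumed to come from prior BLAST and BLASTP runs, and the minimum ORF length mentioned for Table 1 is not modeled.

```python
def assign_table(nt_identity_percent=None, nt_match_length=0,
                 blastp_probability=None):
    """Return 1, 2 or 3 according to the criteria stated for the tables.

    Table 1: 95% or more nucleotide identity over at least 50 bases to a
             known S. aureus sequence.
    Table 2: not in Table 1, BLASTP probability score of 0.01 or less.
    Table 3: no significant polypeptide match.
    """
    if (nt_identity_percent is not None
            and nt_identity_percent >= 95.0
            and nt_match_length >= 50):
        return 1
    if blastp_probability is not None and blastp_probability <= 0.01:
        return 2
    return 3
```

The order of the tests matters: an ORF that satisfies the Table 1 criterion is placed in Table 1 even if it also has a significant BLASTP match.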

In each table, the first and second columns identify the ORF by, respectively, contig number and ORF number 
within the contig; the third column indicates the reading frame, taking the first 5' nucleotide of the contig as the start of 
the +1 frame; the fourth column indicates the first nucleotide of the ORF, counting from the 5' end of the contig strand; 
and the fifth column indicates the length of each ORF in nucleotides. 

In Tables 1 and 2, column six lists the "Reference" for the closest matching sequence available through Genbank.
These reference numbers are the database entry numbers commonly used by those of skill in the art, who will be
familiar with their denominations. Descriptions of the nomenclature are available from the National Center for Biotechnology
Information. Column seven in Tables 1 and 2 provides the "gene name" of the matching sequence; column eight
provides the "BLAST identity" score from the comparison of the ORF and the homologous gene; and column nine
indicates the length in nucleotides of the "highest scoring segment pair" identified by the BLAST identity analysis.

In Table 3, the last column, column six, indicates the length of each ORF in amino acid residues. 

The concepts of percent identity and percent similarity of two polypeptide sequences are well understood in the art.
For example, two polypeptides 10 amino acids in length which differ at three amino acid positions (e.g., at positions
1, 3 and 5) are said to have a percent identity of 70%. However, the same two polypeptides would be deemed to have
a percent similarity of 80% if, for example at position 5, the amino acid moieties, although not identical, were "similar"
(i.e., possessed similar biochemical characteristics). Many programs for analysis of nucleotide or amino acid sequence
similarity, such as FASTA and BLAST, specifically list percent identity of a matching region as an output parameter. Thus,
for instance, Tables 1 and 2 herein enumerate the percent identity of the "highest scoring segment pair" in each ORF
and its listed relative. Further details concerning the algorithms and criteria used for homology searches are provided
below and are described in the pertinent literature highlighted by the citations provided below.
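
The worked example above (70% identity, 80% similarity) can be reproduced with a short calculation. The grouping of "similar" amino acids used here is a simplified, hypothetical classification chosen for illustration; actual programs generally use substitution matrices instead.

```python
# Hypothetical similarity classes (one-letter amino acid codes).
SIMILARITY_GROUPS = ["AVLIM", "FYW", "DE", "KRH", "STNQ"]


def _similar(x, y):
    """True if residues are identical or share a similarity class."""
    return x == y or any(x in g and y in g for g in SIMILARITY_GROUPS)


def percent_identity(a, b):
    """Percent of aligned positions with identical residues."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def percent_similarity(a, b):
    """Percent of aligned positions with identical or similar residues."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    return 100.0 * sum(_similar(x, y) for x, y in zip(a, b)) / len(a)
```

With the hypothetical peptides `"ACDEFGHIKL"` and `"WCKEYGHIKL"`, which differ at positions 1, 3 and 5, `percent_identity` gives 70.0 and `percent_similarity` gives 80.0, because only the position 5 pair (F and Y) falls within one similarity class.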

It will be appreciated that other criteria can be used to generate more inclusive and more exclusive listings of the
types set out in the tables. As those of skill will appreciate, narrow and broad searches both are useful. Thus, a skilled
artisan can readily identify ORFs in contigs of the Staphylococcus aureus genome other than those listed in Tables
1-3, such as ORFs which are overlapping or encoded by the opposite strand of an identified ORF, in addition to those
ascertainable using the computer-based systems of the present invention.

As used herein, an "expression modulating fragment," EMF, means a series of nucleotide molecules which mod- 
ulates the expression of an operably linked ORF or EMF. 

As used herein, a sequence is said to "modulate the expression of an operably linked sequence" when the expression
of the sequence is altered by the presence of the EMF. EMFs include, but are not limited to, promoters and
promoter modulating sequences (inducible elements). One class of EMFs are fragments which induce the expression
of an operably linked ORF in response to a specific regulatory factor or physiological event.

EMF sequences can be identified within the contigs of the Staphylococcus aureus genome by their proximity to
the ORFs provided in Tables 1-3. An intergenic segment, or a fragment of the intergenic segment, from about 10 to
200 nucleotides in length, taken from any one of the ORFs of Tables 1-3 will modulate the expression of an operably
linked ORF in a fashion similar to that found with the naturally linked ORF sequence. As used herein, an "intergenic
segment" refers to fragments of the Staphylococcus aureus genome which are between two ORF(s) herein described.
EMFs also can be identified using known EMFs as a target sequence or target motif in the computer-based systems
of the present invention. Further, the two methods can be combined and used together.
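
Identifying candidate EMFs by proximity can be sketched as extracting the intergenic segments that lie between consecutive ORFs of a contig. The sketch assumes forward-strand coordinates given as 1-based start and length in nucleotides, keeps only gaps of 10 to 200 nucleotides, and uses hypothetical names throughout; whether a given segment actually modulates expression must still be confirmed experimentally.

```python
def intergenic_segments(contig, orfs, min_len=10, max_len=200):
    """Return (start, segment) for gaps between consecutive ORFs.

    orfs is a list of (start, length) pairs, start being 1-based.
    Overlapping or nested ORFs are handled by tracking the furthest
    ORF end seen so far. Returned start positions are 1-based.
    """
    spans = sorted((start - 1, start - 1 + length) for start, length in orfs)
    if not spans:
        return []
    segments = []
    covered_end = spans[0][1]
    for start, end in spans[1:]:
        gap = start - covered_end
        if min_len <= gap <= max_len:
            segments.append((covered_end + 1, contig[covered_end:start]))
        covered_end = max(covered_end, end)
    return segments
```

The segments returned could then be used directly as candidates, or supplied as target sequences to a search means, combining the two identification methods described above.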

The presence and activity of an EMF can be confirmed using an EMF trap vector. An EMF trap vector contains a 
cloning site linked to a marker sequence. A marker sequence encodes an identifiable phenotype, such as antibiotic 
resistance or a complementing nutrition auxotrophic factor, which can be identified or assayed when the EMF trap 
vector is placed within an appropriate host under appropriate conditions. As described above, an EMF will modulate the
expression of an operably linked marker sequence. A more detailed discussion of various marker sequences is provided
below.

A sequence which is suspected of being an EMF is cloned in all three reading frames in one or more restriction
sites upstream from the marker sequence in the EMF trap vector. The vector is then transformed into an appropriate
host using known procedures and the phenotype of the transformed host is examined under appropriate conditions.
As described above, an EMF will modulate the expression of an operably linked marker sequence.

As used herein, a "diagnostic fragment," DF, means a series of nucleotide molecules which selectively hybridize 
to Staphylococcus aureus sequences. DFs can be readily identified by identifying unique sequences within contigs of 
the Staphylococcus aureus genome, such as by using well-known computer analysis software, and by generating and 
testing probes or amplification primers consisting of the DF sequence in an appropriate diagnostic format which determines
amplification or hybridization selectivity.
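
The computational part of DF identification, finding sequences unique to Staphylococcus aureus contigs, might be sketched as a k-mer comparison against background sequences from other organisms. The value of k, the function names, and the exact-match criterion are illustrative assumptions; the candidates produced are only starting points for the probe or primer testing described above.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def candidate_diagnostic_kmers(target_contigs, background_sequences, k=20):
    """Return sorted k-mers found in the targets but on neither strand
    of any background sequence."""
    background = set()
    for seq in background_sequences:
        seq = seq.upper()
        background |= kmers(seq, k) | kmers(reverse_complement(seq), k)
    candidates = set()
    for seq in target_contigs:
        candidates |= kmers(seq.upper(), k)
    return sorted(candidates - background)
```

Exact k-mer absence does not guarantee hybridization selectivity, since near matches can still hybridize, which is why experimental testing in an appropriate diagnostic format remains necessary.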

The sequences falling within the scope of the present invention are not limited to the specific sequences herein 
described, but also include allelic and species variations thereof. Allelic and species variations can be routinely determined
by comparing the sequences provided in SEQ ID NOS:1-5,191, a representative fragment thereof, or a nucleotide
sequence at least 95%, preferably 99% and most preferably 99.9% identical to SEQ ID NOS:1-5,191, with a sequence
from another isolate of the same species.

Furthermore, to accommodate codon variability, the invention includes nucleic acid molecules coding for the same
amino acid sequences as do the nucleic acid sequences mentioned above. In other words, in the coding region of an 
ORF, substitution of one codon for another which encodes the same amino acid is expressly contemplated. 
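
Whether two coding sequences differ only by such synonymous codon substitutions can be checked by translating both and comparing the results. The sketch below builds the standard genetic code table programmatically; the function names are hypothetical, and the example sequences in the usage note are invented for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq):
    """Translate a DNA coding sequence; '*' marks a termination codon."""
    seq = seq.upper()
    if len(seq) % 3:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def encode_same_polypeptide(seq_a, seq_b):
    """True if both sequences translate to the same amino acid sequence."""
    return translate(seq_a) == translate(seq_b)
```

For example, `"ATGGCTAAA"` and `"ATGGCCAAG"` differ at two nucleotide positions but both translate to Met-Ala-Lys, so `encode_same_polypeptide` returns `True` for that pair.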

Any specific sequence disclosed herein can be readily screened for errors by resequencing a particular fragment, 
such as an ORF, in both directions (i.e., sequence both strands). Alternatively, error screening can be performed by
sequencing corresponding polynucleotides of Staphylococcus aureus origin isolated by using part or all of the fragments
in question as a probe or primer. 

Each of the ORFs of the Staphylococcus aureus genome disclosed in Tables 1, 2 and 3, and the EMFs found 5'
to the ORFs, can be used as polynucleotide reagents in numerous ways. For example, the sequences can be used
as diagnostic probes or diagnostic amplification primers to detect the presence of a specific microbe in a sample,
particularly Staphylococcus aureus. Especially preferred in this regard are ORFs such as those of Table 3, which do not
match previously characterized sequences from other organisms and thus are most likely to be highly selective for
Staphylococcus aureus. Also particularly preferred are ORFs that can be used to distinguish between strains of Staphylococcus
aureus, particularly those that distinguish medically important strains, such as drug-resistant strains.

In addition, the fragments of the present invention, as broadly described, can be used to control gene expression
through triple helix formation or antisense DNA or RNA, both of which methods are based on the binding of a polynucleotide
sequence to DNA or RNA. Triple helix formation optimally results in a shut-off of RNA transcription from DNA,
while antisense RNA hybridization blocks translation of an mRNA molecule into polypeptide. Information from the
sequences of the present invention can be used to design antisense and triple helix-forming oligonucleotides. Polynucleotides
suitable for use in these methods are usually 20 to 40 bases in length and are designed to be complementary
to a region of the gene involved in transcription, for triple-helix formation, or to the mRNA itself, for antisense inhibition.
Both techniques have been demonstrated to be effective in model systems, and the requisite techniques are well known
and involve routine procedures. Triple helix techniques are discussed in, for example, Lee et al., Nucl. Acids Res. 6:
3073 (1979); Cooney et al., Science 241: 456 (1988); and Dervan et al., Science 251: 1360 (1991). Antisense techniques
in general are discussed in, for instance, Okano, J. Neurochem. 56: 560 (1991) and OLIGODEOXYNUCLEOTIDES
AS ANTISENSE INHIBITORS OF GENE EXPRESSION, CRC Press, Boca Raton, FL (1988).
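
Designing an antisense oligonucleotide from sequence information reduces, at its simplest, to taking the reverse complement of the chosen mRNA region. The sketch below assumes a 1-based start position and a default length of 20 bases, the lower end of the usual range stated above; the function names are hypothetical, and no check of secondary structure or off-target binding is attempted.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def antisense_oligo(mrna, start, length=20):
    """Return a DNA oligonucleotide complementary to an mRNA region.

    mrna may be written with U or T; start is 1-based. The result is
    written 5' to 3' in DNA letters.
    """
    template = mrna.upper().replace("U", "T")
    region = template[start - 1:start - 1 + length]
    if start < 1 or len(region) < length:
        raise ValueError("requested region extends beyond the mRNA")
    return reverse_complement(region)
```

As a small check of the orientation, `antisense_oligo("AUGC", 1, 4)` yields `"GCAT"`, which pairs base by base with the mRNA region when the two strands are aligned antiparallel.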

The present invention further provides recombinant constructs comprising one or more fragments of the Staphylococcus
aureus genomic fragments and contigs of the present invention. Certain preferred recombinant constructs of
the present invention comprise a vector, such as a plasmid or viral vector, into which a fragment of the Staphylococcus
aureus genome has been inserted, in a forward or reverse orientation. In the case of a vector comprising one of the
ORFs of the present invention, the vector may further comprise regulatory sequences, including for example, a promoter,
operably linked to the ORF. For vectors comprising the EMFs of the present invention, the vector may further
comprise a marker sequence or heterologous ORF operably linked to the EMF.

Large numbers of suitable vectors and promoters are known to those of skill in the art and are commercially 
available for generating the recombinant constructs of the present invention. The following vectors are provided by 
way of example. Useful bacterial vectors include phagescript, PsiX174, pBluescript SK and KS (+ and -), pNH8a,
pNH16a, pNH18a, pNH46a (available from Stratagene); pTrc99A, pKK223-3, pKK233-3, pDR540, pRIT5 (available
from Pharmacia). Useful eukaryotic vectors include pWLneo, pSV2cat, pOG44, pXT1, pSG (available from Stratagene);
pSVK3, pBPV, pMSG, pSVL (available from Pharmacia).

Promoter regions can be selected from any desired gene using CAT (chloramphenicol acetyltransferase) vectors or other
vectors with selectable markers. Two appropriate vectors are pKK232-8 and pCM7. Particular named bacterial promoters
include lacI, lacZ, T3, T7, gpt, lambda PR, and trc. Eukaryotic promoters include CMV immediate early, HSV
thymidine kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of the appropriate
vector and promoter is well within the level of ordinary skill in the art.

The present invention further provides host cells containing any one of the isolated fragments of the Staphylococcus 
aureus genomic fragments and contigs of the present invention, wherein the fragment has been introduced into the 
host cell using known methods. The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower
eukaryotic host cell, such as a yeast cell, or a prokaryotic cell, such as a bacterial cell.

A polynucleotide of the present invention, such as a recombinant construct comprising an ORF of the present
invention, may be introduced into the host by a variety of well established techniques that are standard in the art, such
as calcium phosphate transfection, DEAE-dextran mediated transfection and electroporation, which are described in,
for instance, Davis, L. et al., BASIC METHODS IN MOLECULAR BIOLOGY (1986).

A host cell containing one of the fragments of the Staphylococcus aureus genomic fragments and contigs of the
present invention can be used in conventional manners to produce the gene product encoded by the isolated fragment
(in the case of an ORF) or can be used to produce a heterologous protein under the control of the EMF.

The present invention further provides isolated polypeptides encoded by the nucleic acid fragments of the present 
invention or by degenerate variants of the nucleic acid fragments of the present invention. By "degenerate variant" is
intended nucleotide fragments which differ from a nucleic acid fragment of the present invention (e.g., an ORF) by
nucleotide sequence but, due to the degeneracy of the Genetic Code, encode an identical polypeptide sequence. 

Preferred nucleic acid fragments of the present invention are the ORFs depicted in Tables 2 and 3 which encode 
proteins. 

A variety of methodologies known in the art can be utilized to obtain any one of the isolated polypeptides or proteins
of the present invention. At the simplest level, the amino acid sequence can be synthesized using commercially available
peptide synthesizers. This is particularly useful in producing small peptides and fragments of larger polypeptides.
Such short fragments as may be obtained most readily by synthesis are useful, for example, in generating antibodies 
against the native polypeptide, as discussed further below. 

In an alternative method, the polypeptide or protein is purified from bacterial cells which naturally produce the
polypeptide or protein. One skilled in the art can readily employ well-known methods for isolating polypeptides and
proteins to isolate and purify polypeptides or proteins of the present invention produced naturally by a bacterial strain,
or by other methods. Methods for isolation and purification that can be employed in this regard include, but are not
limited to, immunochromatography, HPLC, size-exclusion chromatography, ion-exchange chromatography, and immuno-affinity
chromatography.

The polypeptides and proteins of the present invention also can be purified from cells which have been altered to 
express the desired polypeptide or protein. As used herein, a cell is said to be altered to express a desired polypeptide
or protein when the cell, through genetic manipulation, is made to produce a polypeptide or protein which it normally
does not produce or which the cell normally produces at a lower level. Those skilled in the art can readily adapt procedures
for introducing and expressing either recombinant or synthetic sequences into eukaryotic or prokaryotic cells
in order to generate a cell which produces one of the polypeptides or proteins of the present invention.

Any host/vector system can be used to express one or more of the ORFs of the present invention. These include, 
but are not limited to, eukaryotic hosts such as HeLa cells, CV-1 cells, COS cells, and Sf9 cells, as well as prokaryotic
hosts such as E. coli and B. subtilis. The most preferred cells are those which do not normally express the particular
polypeptide or protein or which express the polypeptide or protein at a low natural level.

"Recombinant," as used herein, means that a polypeptide or protein is derived from recombinant (e.g., microbial
or mammalian) expression systems. "Microbial" refers to recombinant polypeptides or proteins made in bacterial or
fungal (e.g., yeast) expression systems. As a product, "recombinant microbial" defines a polypeptide or protein essentially
free of native endogenous substances and unaccompanied by associated native glycosylation. Polypeptides or
proteins expressed in most bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; polypeptides or
proteins expressed in yeast will have a glycosylation pattern different from that expressed in mammalian cells.

"Nucleotide sequence" refers to a heteropolymer of deoxyribonucleotides. Generally, DNA segments encoding the
polypeptides and proteins provided by this invention are assembled from fragments of the Staphylococcus aureus
genome and short oligonucleotide linkers, or from a series of oligonucleotides, to provide a synthetic gene which is
capable of being expressed in a recombinant transcriptional unit comprising regulatory elements derived from a microbial
or viral operon.

"Recombinant expression vehicle or vector" refers to a plasmid or phage or virus or vector, for expressing a polypeptide
from a DNA (RNA) sequence. The expression vehicle can comprise a transcriptional unit comprising an assembly
of (1) a genetic regulatory element or elements necessary for gene expression in the host, including elements required
to initiate and maintain transcription at a level sufficient for suitable expression of the desired polypeptide, including,
for example, promoters and, where necessary, an enhancer and a polyadenylation signal; (2) a structural or coding sequence
which is transcribed into mRNA and translated into protein; and (3) appropriate signals to initiate translation at the
beginning of the desired coding region and terminate translation at its end. Structural units intended for use in yeast
or eukaryotic expression systems preferably include a leader sequence enabling extracellular secretion of translated
protein by a host cell. Alternatively, where recombinant protein is expressed without a leader or transport sequence,
it may include an N-terminal methionine residue. This residue may or may not be subsequently cleaved from the
expressed recombinant protein to provide a final product.

"Recombinant expression system" means host cells which have stably integrated a recombinant transcriptional
unit into chromosomal DNA or carry the recombinant transcriptional unit extrachromosomally. The cells can be prokaryotic
or eukaryotic. Recombinant expression systems as defined herein will express heterologous polypeptides or proteins
upon induction of the regulatory elements linked to the DNA segment or synthetic gene to be expressed.

Mature proteins can be expressed in mammalian cells, yeast, bacteria, or other cells under the control of appropriate
promoters. Cell-free translation systems can also be employed to produce such proteins using RNAs derived
from the DNA constructs of the present invention. Appropriate cloning and expression vectors for use with prokaryotic
and eukaryotic hosts are described in Sambrook et al., MOLECULAR CLONING: A LABORATORY MANUAL, 2nd Edition,
Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York (1989), the disclosure of which is hereby
incorporated by reference in its entirety. 

Generally, recombinant expression vectors will include origins of replication and selectable markers permitting
transformation of the host cell, e.g., the ampicillin resistance gene of E. coli and S. cerevisiae TRP1 gene, and a
promoter derived from a highly expressed gene to direct transcription of a downstream structural sequence. Such
promoters can be derived from operons encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), alpha-factor,
acid phosphatase, or heat shock proteins, among others. The heterologous structural sequence is assembled
in appropriate phase with translation initiation and termination sequences, and preferably, a leader sequence capable
of directing secretion of translated protein into the periplasmic space or extracellular medium. Optionally, the heterologous
sequence can encode a fusion protein including an N-terminal identification peptide imparting desired characteristics,
e.g., stabilization or simplified purification of expressed recombinant product.

Useful expression vectors for bacterial use are constructed by inserting a structural DNA sequence encoding a 
desired protein together with suitable translation initiation and termination signals in operable reading phase with a
functional promoter. The vector will comprise one or more phenotypic selectable markers and an origin of replication
to ensure maintenance of the vector and, when desirable, provide amplification within the host.

Suitable prokaryotic hosts for transformation include strains of Staphylococcus aureus, E. coli, B. subtilis, Salmonella
typhimurium and various species within the genera Pseudomonas, Streptomyces, and Staphylococcus. Others
may also be employed as a matter of choice.

As a representative but non-limiting example, useful expression vectors for bacterial use can comprise a selectable 
marker and bacterial origin of replication derived from commercially available plasmids comprising genetic elements 
of the well known cloning vector pBR322 (ATCC 37017). Such commercial vectors include, for example, pKK223-3
(available from Pharmacia Fine Chemicals, Uppsala, Sweden) and GEM 1 (available from Promega Biotec, Madison,
WI, USA). These pBR322 "backbone" sections are combined with an appropriate promoter and the structural sequence
to be expressed. 

Following transformation of a suitable host strain and growth of the host strain to an appropriate cell density, the 
selected promoter, where it is inducible, is derepressed or induced by appropriate means (e.g., temperature shift or
chemical induction) and cells are cultured for an additional period to provide for expression of the induced gene product.
Thereafter cells are typically harvested, generally by centrifugation, disrupted to release expressed protein, generally 
by physical or chemical means, and the resulting crude extract is retained for further purification. 

Various mammalian cell culture systems can also be employed to express recombinant protein. Examples of
mammalian expression systems include the COS-7 lines of monkey kidney fibroblasts, described in Gluzman, Cell 23:175
(1981), and other cell lines capable of expressing a compatible vector, for example, the C127, 3T3, CHO, HeLa and
BHK cell lines.

Mammalian expression vectors will comprise an origin of replication, a suitable promoter and enhancer, and also 
any necessary ribosome binding sites, polyadenylation site, splice donor and acceptor sites, transcriptional termination 
sequences, and 5' flanking nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for
example, SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may be used to provide the required
nontranscribed genetic elements.

Recombinant polypeptides and proteins produced in bacterial culture are usually isolated by initial extraction from
cell pellets, followed by one or more salting-out, aqueous ion exchange or size exclusion chromatography steps.
Microbial cells employed in expression of proteins can be disrupted by any convenient method, including freeze-thaw
cycling, sonication, mechanical disruption, or use of cell lysing agents. Protein refolding steps can be used, as
necessary, in completing configuration of the mature protein. Finally, high performance liquid chromatography (HPLC) can
be employed for final purification steps.

An additional aspect of the invention includes Staphylococcus aureus polypeptides which are useful as
immunodiagnostic antigens and/or immunoprotective vaccines, collectively "immunologically useful polypeptides". Such
immunologically useful polypeptides may be selected from the ORFs disclosed herein based on techniques well known
in the art and described elsewhere herein. The inventors have used the following criteria to select several
immunologically useful polypeptides:

As is known in the art, an amino terminal type I signal sequence directs a nascent protein across the plasma and
outer membranes to the exterior of the bacterial cell. Such outermembrane polypeptides are expected to be
immunologically useful. According to Izard, J. W. et al., Mol. Microbiol. 13, 765-773 (1994), polypeptides containing type I
signal sequences contain the following physical attributes: The length of the type I signal sequence is approximately
15 to 25 primarily hydrophobic amino acid residues with a net positive charge in the extreme amino terminus; the
central region of the signal sequence must adopt an alpha-helical conformation in a hydrophobic environment; and the
region surrounding the actual site of cleavage is ideally six residues long, with small side-chain amino acids in the -1
and -3 positions.

Also known in the art is the type IV signal sequence which is an example of the several types of functional signal
sequences which exist in addition to the type I signal sequence detailed above. Although functionally related, the type
IV signal sequence possesses a unique set of biochemical and physical attributes (Strom, M. S. and Lory, S., J.
Bacteriol. 174, 7345-7351; 1992). These are typically six to eight amino acids with a net basic charge followed by an
additional sixteen to thirty primarily hydrophobic residues. The cleavage site of a type IV signal sequence is typically
after the initial six to eight amino acids at the extreme amino terminus. In addition, all type IV signal sequences contain
a phenylalanine residue at the +1 site relative to the cleavage site.
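
The type IV attributes listed above can be expressed as a rough screen. The following Python sketch is illustrative
only and is not the inventors' algorithm: the function name, the hydrophobic residue set, the use of K/R as basic and
D/E as acidic residues, and the 60% cutoff used to approximate "primarily hydrophobic" are all assumptions.

```python
# Hypothetical screen for type IV-like leaders; residue sets and cutoffs are assumptions.
HYDROPHOBIC = set("AVLIMFW")  # assumed definition of "hydrophobic"
BASIC = set("KR")             # assumed basic residues
ACIDIC = set("DE")            # assumed acidic residues


def type_iv_leader_lengths(seq, min_hydrophobic_fraction=0.6):
    """Return leader lengths (6 to 8) consistent with the attributes in the text.

    A leader length n is reported when the first n residues carry a net basic
    charge, residue n+1 (the +1 site) is phenylalanine, and the 16 residues
    after the cleavage point are mostly hydrophobic.
    """
    seq = seq.upper()
    hits = []
    for leader_len in range(6, 9):
        leader = seq[:leader_len]
        tail = seq[leader_len:leader_len + 16]  # shortest hydrophobic stretch in the text
        if len(tail) < 16:
            continue
        net_charge = sum(r in BASIC for r in leader) - sum(r in ACIDIC for r in leader)
        fraction = sum(r in HYDROPHOBIC for r in tail) / len(tail)
        if net_charge > 0 and seq[leader_len] == "F" and fraction >= min_hydrophobic_fraction:
            hits.append(leader_len)
    return hits
```

A call such as type_iv_leader_lengths(protein) returns an empty list when no leader length from six to eight satisfies
all three conditions.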

Studies of the cleavage sites of twenty-six bacterial lipoprotein precursors have allowed the definition of a consensus
amino acid sequence for lipoprotein cleavage. Nearly three-fourths of the bacterial lipoprotein precursors examined
contained the sequence L-(A,S)-(G,A)-C at positions -3 to +1, relative to the point of cleavage (Hayashi, S. and Wu,
H. C. Lipoproteins in bacteria. J. Bioenerg. Biomembr. 22, 451-471; 1990).
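
The lipoprotein consensus above maps directly onto a regular expression. The sketch below is a minimal illustration
under stated assumptions, not the inventors' implementation: the function name and the 40-residue search window are
hypothetical choices made for the example.

```python
import re

# L-(A,S)-(G,A)-C spans positions -3 to +1; cleavage falls immediately before the cysteine.
LIPOPROTEIN_CONSENSUS = re.compile(r"L[AS][GA]C")


def lipoprotein_cleavage_sites(seq, max_start=40):
    """Return 0-based indices of the +1 cysteine for each consensus match.

    Only matches starting within the first `max_start` residues are kept;
    that window is an assumption, since the text does not specify one.
    """
    return [
        match.start() + 3
        for match in LIPOPROTEIN_CONSENSUS.finditer(seq.upper())
        if match.start() < max_start
    ]
```

The returned index is the position of the cysteine that would become the first residue of the mature lipoprotein.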

It is well known that most anchored proteins found on the surface of gram-positive bacteria possess a highly
conserved carboxy terminal sequence. More than fifty such proteins from organisms such as S. pyogenes, S. mutans, E.
faecalis, S. pneumoniae, and others, have been identified based on their extracellular location and carboxy terminal
amino acid sequence (Fischetti, V. A. Gram-positive commensal bacteria deliver antigens to elicit mucosal and systemic
immunity. ASM News 62, 405-410; 1996). The conserved region is comprised of six charged amino acids at the extreme
carboxy terminus coupled to 15-20 hydrophobic amino acids presumed to function as a transmembrane domain.
Immediately adjacent to the transmembrane domain is a six amino acid sequence conserved in nearly all proteins
examined. The amino acid sequence of this region is L-P-X-T-G-X, where X is any amino acid.
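
A search for the L-P-X-T-G-X anchor motif can be sketched as follows. This is an illustrative example only: the
function name and the 50-residue carboxy terminal window are assumptions chosen to reflect the motif's location near
the transmembrane domain, and are not taken from the disclosure.

```python
import re

# L-P-X-T-G-X, where X is any amino acid (single-letter codes).
ANCHOR_MOTIF = re.compile(r"LP[A-Z]TG[A-Z]")


def find_anchor_motifs(seq, c_terminal_window=50):
    """Return (start, motif) pairs for motif matches near the carboxy terminus.

    A match is kept when its 0-based start lies within the last
    `c_terminal_window` residues; the window size is an assumed cutoff.
    """
    seq = seq.upper()
    cutoff = max(0, len(seq) - c_terminal_window)
    return [
        (match.start(), match.group())
        for match in ANCHOR_MOTIF.finditer(seq)
        if match.start() >= cutoff
    ]
```

Candidates found this way would still need to be checked for the downstream hydrophobic stretch and charged tail
described above.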

Amino acid sequence similarities to proteins of known function by BLAST enables the assignment of putative 
functions to novel amino acid sequences and allows for the selection of proteins thought to function outside the cell 
wall. Such proteins are well known in the art and include "lipoprotein", "periplasmic", or "antigen".

An algorithm for selecting antigenic and immunogenic Staphylococcus aureus polypeptides including the foregoing
criteria was developed by the present inventors. Use of the algorithm by the inventors to select immunologically useful
Staphylococcus aureus polypeptides resulted in the selection of several ORFs which are predicted to be
outermembrane-associated proteins. These proteins are identified in Table 4, below, and shown in the Sequence Listing as SEQ
ID NOS:5,192 to 5,255. Thus the amino acid sequence of each of several antigenic Staphylococcus aureus polypeptides
listed in Table 4 can be determined, for example, by locating the amino acid sequence of the ORF in the Sequence
Listing. Likewise the polynucleotide sequence encoding each ORF can be found by locating the corresponding
polynucleotide SEQ ID in Tables 1, 2, or 3, and finding the corresponding nucleotide sequence in the sequence listing.

As will be appreciated by those of ordinary skill in the art, although a polypeptide representing an entire ORF may
be the closest approximation to a protein found in vivo, it is not always technically practical to express a complete ORF
in vitro. It may be very challenging to express and purify a highly hydrophobic protein by common laboratory methods.
As a result, the immunologically useful polypeptides described herein as SEQ ID NOS:5,192-5,255 may have been
modified slightly to simplify the production of recombinant protein, and are the preferred embodiments. In general,
nucleotide sequences which encode highly hydrophobic domains, such as those found at the amino terminal signal
sequence, are excluded for enhanced in vitro expression of the polypeptides. Furthermore, any highly hydrophobic
amino acid sequences occurring at the carboxy terminus are also excluded. Such truncated polypeptides include, for
example, the mature forms of the polypeptides expected to exist in nature.

Those of ordinary skill in the art can identify soluble portions of the polypeptides identified in Table 4, and in the
case of truncated polypeptide sequences shown as SEQ ID NOS:5,192-5,255, may obtain the complete predicted amino
acid sequence of each polypeptide by translating the corresponding polynucleotide sequences of the corresponding
ORF listed in Tables 1, 2 and 3 and found in the sequence listing.

Accordingly, polypeptides comprising the complete amino acid sequence of an immunologically useful polypeptide
selected from the group of polypeptides encoded by the ORFs identified in Table 4, or an amino acid sequence at least 95%
identical thereto, preferably at least 97% identical thereto, and most preferably at least 99% identical thereto, form an
embodiment of the invention; in addition, polypeptides comprising an amino acid sequence selected from the group of
amino acid sequences shown in the sequence listing as SEQ ID NOS:5,191-5,255, or an amino acid sequence at least
95% identical thereto, preferably at least 97% identical thereto and most preferably at least 99% identical thereto, form
an embodiment of the invention. Polynucleotides encoding the foregoing polypeptides also form part of the present
invention.

In another aspect, the invention provides a peptide or polypeptide comprising an epitope-bearing portion of a
polypeptide of the invention, particularly those epitope-bearing portions (antigenic regions) identified in Table 4. The
epitope-bearing portion is an immunogenic or antigenic epitope of a polypeptide of the invention. An "immunogenic
epitope" is defined as a part of a protein that elicits an antibody response when the whole protein is the immunogen.
On the other hand, a region of a protein molecule to which an antibody can bind is defined as an "antigenic epitope."
The number of immunogenic epitopes of a protein generally is less than the number of antigenic epitopes. See, for
instance, Geysen et al., Proc. Natl. Acad. Sci. USA 81:3998-4002 (1983).

As to the selection of peptides or polypeptides bearing an antigenic epitope (i.e., that contain a region of a protein
molecule to which an antibody can bind), it is well known in the art that relatively short synthetic peptides that mimic
part of a protein sequence are routinely capable of eliciting an antiserum that reacts with the partially mimicked protein.
See, for instance, Sutcliffe, J. G., Shinnick, T. M., Green, N. and Lerner, R. A. (1983) "Antibodies that react with
predetermined sites on proteins", Science, 219:660-666. Peptides capable of eliciting protein-reactive sera are
frequently represented in the primary sequence of a protein, can be characterized by a set of simple chemical rules, and
are confined neither to immunodominant regions of intact proteins (i.e., immunogenic epitopes) nor to the amino or
carboxyl terminals. Antigenic epitope-bearing peptides and polypeptides of the invention are therefore useful to raise
antibodies, including monoclonal antibodies, that bind specifically to a polypeptide of the invention. See, for instance,
Wilson et al., Cell 37:767-778 (1984) at 777.

Antigenic epitope-bearing peptides and polypeptides of the invention preferably contain a sequence of at least
seven, more preferably at least nine and most preferably between about 15 and about 30 amino acids contained within
the amino acid sequence of a polypeptide of the invention. Non-limiting examples of antigenic polypeptides or peptides
that can be used to generate S. aureus specific antibodies include a polypeptide comprising the peptides shown in Table
4 below. These polypeptide fragments have been determined to bear antigenic epitopes of indicated S. aureus proteins
by the analysis of the Jameson-Wolf antigenic index, a representative sample of which is shown in Figure 3.

The epitope-bearing peptides and polypeptides of the invention may be produced by any conventional means.
See, e.g., Houghten, R. A. (1985) General method for the rapid solid-phase synthesis of large numbers of peptides:
specificity of antigen-antibody interaction at the level of individual amino acids. Proc. Natl. Acad. Sci. USA 82:
5131-5135; this "Simultaneous Multiple Peptide Synthesis (SMPS)" process is further described in U.S. Patent No.
4,631,211 to Houghten et al. (1986). Epitope-bearing peptides and polypeptides of the invention are used to induce
antibodies according to methods well known in the art. See, for instance, Sutcliffe et al., supra; Wilson et al., supra;
Chow, M. et al., Proc. Natl. Acad. Sci. USA 82:910-914; and Bittle, F. J. et al., J. Gen. Virol. 66:2347-2354 (1985).

Immunogenic epitope-bearing peptides of the invention, i.e., those parts of a protein that elicit an antibody response
when the whole protein is the immunogen, are identified according to methods known in the art. See, for instance,
Geysen et al., supra. Further still, U.S. Patent No. 5,194,392 to Geysen (1990) describes a general method of detecting
or determining the sequence of monomers (amino acids or other compounds) which is a topological equivalent of the
epitope (i.e., a "mimotope") which is complementary to a particular paratope (antigen binding site) of an antibody of
interest. More generally, U.S. Patent No. 4,433,092 to Geysen (1989) describes a method of detecting or determining
a sequence of monomers which is a topographical equivalent of a ligand which is complementary to the ligand binding
site of a particular receptor of interest. Similarly, U.S. Patent No. 5,480,971 to Houghten, R. A. et al. (1996) on
Peralkylated Oligopeptide Mixtures discloses linear C1-C7-alkyl peralkylated oligopeptides and sets and libraries of such
peptides, as well as methods for using such oligopeptide sets and libraries for determining the sequence of a
peralkylated oligopeptide that preferentially binds to an acceptor molecule of interest. Thus, non-peptide analogs of the
epitope-bearing peptides of the invention also can be made routinely by these methods.

Table 4 lists immunologically useful polypeptides identified by an algorithm which locates novel Staphylococcus
aureus outermembrane proteins, as is described above. Also listed are epitopes or "antigenic regions" of each of the
identified polypeptides. The antigenic regions, or epitopes, are delineated by two numbers x-y, where x is the number
of the first amino acid in the open reading frame included within the epitope and y is the number of the last amino acid
in the open reading frame included within the epitope. For example, the first epitope in ORF 168-6 is comprised of
amino acids 36 to 45 of SEQ ID NO:5,192, as is described in Table 4. The inventors have identified several epitopes
for each of the antigenic polypeptides identified in Table 4. Accordingly, forming part of the present invention are
polypeptides comprising an amino acid sequence of one or more antigenic regions identified in Table 4. The invention
further provides polynucleotides encoding such polypeptides.
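
The x-y notation uses 1-based, inclusive amino acid numbering, so extracting an antigenic region from an ORF
translation requires a small index adjustment in most programming languages. The Python sketch below illustrates
this; the function names and the example sequences in the usage note are hypothetical and are not taken from the
Sequence Listing.

```python
def parse_region(region):
    """Parse an 'x-y' antigenic region into 1-based inclusive (x, y) integers."""
    first, last = (int(part) for part in region.split("-"))
    if first < 1 or last < first:
        raise ValueError(f"invalid region: {region!r}")
    return first, last


def epitope_peptide(orf_protein, region):
    """Return the residues of `orf_protein` covered by the 'x-y' region."""
    first, last = parse_region(region)
    if last > len(orf_protein):
        raise ValueError("region extends past the end of the sequence")
    # Convert 1-based inclusive coordinates to a 0-based, end-exclusive slice.
    return orf_protein[first - 1:last]
```

For instance, a region written as 36-45 selects ten residues, and epitope_peptide("ACDEFGHIKLMNPQRSTVWY", "3-6")
selects the third through sixth residues of that made-up sequence.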

The present invention further includes isolated polypeptides, proteins and nucleic acid molecules which are
substantially equivalent to those herein described. As used herein, substantially equivalent can refer both to nucleic acid
and amino acid sequences, for example a mutant sequence, that varies from a reference sequence by one or more
substitutions, deletions, or additions, the net effect of which does not result in an adverse functional dissimilarity
between reference and subject sequences. For purposes of the present invention, sequences having equivalent biological
activity, and equivalent expression characteristics are considered substantially equivalent. For purposes of determining
equivalence, truncation of the mature sequence should be disregarded.

The invention further provides methods of obtaining homologs, from other strains of Staphylococcus aureus, of the
fragments of the Staphylococcus aureus genome of the present invention and homologs of the proteins encoded by
the ORFs of the present invention. As used herein, a sequence or protein of Staphylococcus aureus is defined as a
homolog of one of the Staphylococcus aureus fragments or contigs, or of a protein encoded by one of the ORFs
of the present invention, if it shares significant homology with one of the fragments of the Staphylococcus aureus genome
of the present invention or a protein encoded by one of the ORFs of the present invention. Specifically, by using the
sequence disclosed herein as a probe or as primers, and techniques such as PCR cloning and colony/plaque
hybridization, one skilled in the art can obtain homologs.

As used herein, two nucleic acid molecules or proteins are said to "share significant homology" if the two contain
regions which possess greater than 85% sequence (amino acid or nucleic acid) homology. Preferred homologs in this
regard are those with more than 90% homology. Especially preferred are those with 93% or more homology. Among
especially preferred homologs, those with 95% or more homology are particularly preferred. Very particularly preferred
among these are those with 97%, and even more particularly preferred among those are homologs with 99% or more
homology. The most preferred homologs among these are those with 99.9% homology or more. It will be understood
that, among measures of homology, identity is particularly preferred in this regard.
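
The graded thresholds above can be applied mechanically once a percent identity has been computed. The sketch
below is illustrative only: it assumes the two sequences have already been aligned to equal length with "-" marking
gaps, counts identity over the full alignment length (one of several possible conventions), and treats the 97% tier
as inclusive, which the text leaves open. The function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over two pre-aligned, equal-length sequences ('-' marks a gap)."""
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be non-empty and of equal aligned length")
    matches = sum(a == b and a != "-" for a, b in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


# (threshold, inclusive, label), checked from most to least stringent.
TIERS = [
    (99.9, True, "most preferred"),
    (99.0, True, "even more particularly preferred"),
    (97.0, True, "very particularly preferred"),  # inclusivity assumed
    (95.0, True, "particularly preferred"),
    (93.0, True, "especially preferred"),
    (90.0, False, "preferred"),                   # "more than 90%"
    (85.0, False, "significant homology"),        # "greater than 85%"
]


def homology_tier(pct):
    """Return the most stringent tier label met by `pct`, or None if below all tiers."""
    for threshold, inclusive, label in TIERS:
        if pct > threshold or (inclusive and pct == threshold):
            return label
    return None
```

Under these assumptions, a pair at exactly 90% identity falls in the "significant homology" tier rather than the
"preferred" tier, because the text requires more than 90% for the latter.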

Region specific primers or probes derived from the nucleotide sequence provided in SEQ ID NOS:1-5,191 or from
a nucleotide sequence at least 95%, particularly at least 99%, especially at least 99.5% identical to a sequence of SEQ
ID NOS:1-5,191 can be used to prime DNA synthesis and PCR amplification, as well as to identify colonies containing
cloned DNA encoding a homolog. Methods suitable to this aspect of the present invention are well known and have
been described in great detail in many publications such as, for example, Innis et al., PCR PROTOCOLS, Academic
Press, San Diego, CA (1990).

When using primers derived from SEQ ID NOS:1-5,191 or from a nucleotide sequence having an aforementioned
identity to a sequence of SEQ ID NOS:1-5,191, one skilled in the art will recognize that by employing high stringency
conditions (e.g., annealing at 50-60°C in 6X SSPC and 50% formamide, and washing at 50-65°C in 0.5X SSPC) only
sequences which are greater than 75% homologous to the primer will be amplified. By employing lower stringency
conditions (e.g., hybridizing at 35-37°C in 5X SSPC and 40-45% formamide, and washing at 42°C in 0.5X SSPC),
sequences which are greater than 40-50% homologous to the primer will also be amplified.

When using DNA probes derived from SEQ ID NOS:1-5,191, or from a nucleotide sequence having an
aforementioned identity to a sequence of SEQ ID NOS:1-5,191, for colony/plaque hybridization, one skilled in the art will
recognize that by employing high stringency conditions (e.g., hybridizing at 50-65°C in 5X SSPC and 50% formamide, and
washing at 50-65°C in 0.5X SSPC), sequences having regions which are greater than 90% homologous to the probe
can be obtained, and that by employing lower stringency conditions (e.g., hybridizing at 35-37°C in 5X SSPC and
40-45% formamide, and washing at 42°C in 0.5X SSPC), sequences having regions which are greater than 35-45%
homologous to the probe will be obtained.

Any organism can be used as the source for homologs of the present invention so long as the organism naturally
expresses such a protein or contains genes encoding the same. The most preferred organisms for isolating homologs
are bacteria which are closely related to Staphylococcus aureus.

ILLUSTRATIVE USES OF COMPOSITIONS OF THE INVENTION 


Each ORF provided in Tables 1 and 2 is identified with a function by homology to a known gene or polypeptide. 
As a result, one skilled in the art can use the polypeptides of the present invention for commercial, therapeutic and 
industrial purposes consistent with the type of putative identification of the polypeptide. Such identifications permit one 
skilled in the art to use the Staphylococcus aureus ORFs in a manner similar to the known type of sequences for which
the identification is made; for example, to ferment a particular sugar source or to produce a particular metabolite. A
variety of reviews illustrative of this aspect of the invention are available, including the following reviews on the industrial
use of enzymes, for example, BIOCHEMICAL ENGINEERING AND BIOTECHNOLOGY HANDBOOK, 2nd Ed.,
Macmillan Publications, Ltd. NY (1991) and BIOCATALYSTS IN ORGANIC SYNTHESES, Tramper et al., Eds., Elsevier
Science Publishers, Amsterdam, The Netherlands (1985). A variety of exemplary uses that illustrate this and similar
aspects of the present invention are discussed below.

1. Biosynthetic Enzymes 

Open reading frames encoding proteins that mediate the catalytic reactions of intermediary and macromolecular
metabolism, the biosynthesis of small molecules, cellular processes and other functions can be used for industrial
biosynthesis. Such proteins include enzymes involved in the degradation of the intermediary products of metabolism,
enzymes involved in central intermediary metabolism, enzymes involved in respiration, both aerobic and anaerobic,
enzymes involved in fermentation, enzymes involved in ATP proton motive force conversion, enzymes involved in broad
regulatory function, enzymes involved in amino acid synthesis, enzymes involved in nucleotide synthesis, and enzymes
involved in cofactor and vitamin synthesis.

The various metabolic pathways present in Staphylococcus aureus can be identified based on absolute nutritional 
requirements as well as by examining the various enzymes identified in Tables 1-3 and SEQ ID NOS:1-5,191.

Of particular interest are polypeptides involved in the degradation of intermediary metabolites as well as
non-macromolecular metabolism. Such enzymes include amylases, glucose oxidases, and catalase.

Proteolytic enzymes are another class of commercially important enzymes. Proteolytic enzymes find use in a
number of industrial processes including the processing of flax and other vegetable fibers, in the extraction, clarification
and depectinization of fruit juices, in the extraction of vegetable oil and in the maceration of fruits and vegetables to
give unicellular fruits. A detailed review of the proteolytic enzymes used in the food industry is provided in Rombouts
et al., Symbiosis 21:79 (1986) and Voragen et al. in BIOCATALYSTS IN AGRICULTURAL BIOTECHNOLOGY,
Whitaker et al., Eds., American Chemical Society Symposium Series 389:93 (1989).

The metabolism of sugars is an important aspect of the primary metabolism of Staphylococcus aureus. Enzymes
involved in the degradation of sugars, such as, particularly, glucose, galactose, fructose and xylose, can be used in
industrial fermentation. Some of the important sugar transforming enzymes, from a commercial viewpoint, include
sugar isomerases such as glucose isomerase. Other metabolic enzymes have found commercial use, such as glucose
oxidases, which produce ketogulonic acid (KGA). KGA is an intermediate in the commercial production of ascorbic
acid using Reichstein's procedure, as described in Krueger et al., Biotechnology 6(A), Rhine et al., Eds., Verlag
Press, Weinheim, Germany (1984).

Glucose oxidase (GOD) is commercially available and has been used in purified form as well as in an immobilized
form for the deoxygenation of beer. See, for instance, Hartmeir et al., Biotechnology Letters 1:21 (1979). The most
important application of GOD is the industrial scale fermentation of gluconic acid. Gluconic acids are marketed for
use in the detergent, textile, leather, photographic, pharmaceutical, food, feed and concrete industries, as described,
for example, in Bigelis et al., beginning on page 357 in GENE MANIPULATIONS AND FUNGI; Benett et al., Eds.,
Academic Press, New York (1995). In addition to industrial applications, GOD has found applications in medicine for
quantitative determination of glucose in body fluids and, recently, in biotechnology for analyzing syrups from starch and
cellulose hydrolysates. This application is described in Owusu et al., Biochem. et Biophysica. Acta. 872:83 (1986), for
instance.

The main sweetener used in the world today is sugar which comes from sugar beets and sugar cane. In the field
of industrial enzymes, the glucose isomerase process shows the largest expansion in the market today. Initially, soluble
enzymes were used and later immobilized enzymes were developed (Krueger et al., Biotechnology, The Textbook of
Industrial Microbiology, Sinauer Associated Incorporated, Sunderland, Massachusetts (1990)). Today, the use of
glucose-produced high fructose syrups is by far the largest industrial business using immobilized enzymes. A review of
the industrial use of these enzymes is provided by Jorgensen, Starch 40:307 (1988).

Proteinases, such as alkaline serine proteinases, are used as detergent additives and thus represent one of the
largest volumes of microbial enzymes used in the industrial sector. Because of their industrial importance, there is a
large body of published and unpublished information regarding the use of these enzymes in industrial processes. (See
Faultman et al., Acid Proteases Structure Function and Biology, Tang, J., ed., Plenum Press, New York (1977) and
Godfrey et al., Industrial Enzymes, MacMillan Publishers, Surrey, UK (1983) and Hepner et al., Report Industrial
Enzymes by 1990, Hel Hepner & Associates, London (1986)).

Another class of commercially usable proteins of the present invention are the microbial lipases, described by, for
instance, Macrae et al., Philosophical Transactions of the Chiral Society of London 310:227 (1985) and Poserke,
Journal of the American Oil Chemist Society 61:1758 (1984). A major use of lipases is in the fat and oil industry for the
production of neutral glycerides using lipase catalyzed inter-esterification of readily available triglycerides. Applications
of lipases include use as a detergent additive to facilitate the removal of fats from fabrics in the course of the
washing procedures.

The use of enzymes, and in particular microbial enzymes, as catalysts for key steps in the synthesis of complex
organic molecules is gaining popularity at a great rate. One area of great interest is the preparation of chiral
intermediates. Preparation of chiral intermediates is of interest to a wide range of synthetic chemists, particularly those scientists
involved with the preparation of new pharmaceuticals, agrochemicals, fragrances and flavors. (See Davies et al.,
Recent Advances in the Generation of Chiral Intermediates Using Enzymes, CRC Press, Boca Raton, Florida (1990)).
The following reactions catalyzed by enzymes are of interest to organic chemists: hydrolysis of carboxylic acid esters,
phosphate esters, amides and nitriles, esterification reactions, trans-esterification reactions, synthesis of amides,
reduction of alkanones and oxoalkanates, oxidation of alcohols to carbonyl compounds, oxidation of sulfides to sulfoxides,
and carbon bond forming reactions such as the aldol reaction.

When considering the use of an enzyme encoded by one of the ORFs of the present invention for biotransformation
and organic synthesis it is sometimes necessary to consider the respective advantages and disadvantages of using a
microorganism as opposed to an isolated enzyme. The pros and cons of using a whole cell system on the one hand or an
isolated partially purified enzyme on the other hand have been described in detail by Bud et al., Chemistry in Britain
(1987), p. 127.

Aminotransferases, enzymes involved in the biosynthesis and metabolism of amino acids, are useful in the catalytic
production of amino acids. The advantage of using microbial based enzyme systems is that the aminotransferase
enzymes catalyze the stereoselective synthesis of only L-amino acids and generally possess uniformly high catalytic
rates. A description of the use of aminotransferases for amino acid production is provided by Roselle-David, Methods
of Enzymology 136:479 (1987).

Another category of useful proteins encoded by the ORFs of the present invention includes enzymes involved in
nucleic acid synthesis, repair, and recombination. A variety of commercially important enzymes have previously been
isolated from members of Staphylococcus aureus. These include Sau3A and Sau96I.

2. Generation of Antibodies

As described here, the proteins of the present invention, as well as homologs thereof, can be used in a variety
of procedures and methods known in the art which are currently applied to other proteins. The proteins of the present
invention can further be used to generate an antibody which selectively binds the protein. Such antibodies can be
either monoclonal or polyclonal antibodies, as well as fragments of these antibodies, and humanized forms.

The invention further provides antibodies which selectively bind to one of the proteins of the present invention and 
hybridomas which produce these antibodies. A hybridoma is an immortalized cell line which is capable of secreting a 
specific monoclonal antibody. 

In general, techniques for preparing polyclonal and monoclonal antibodies as well as hybridomas capable of
producing the desired antibody are well known in the art (Campbell, A. M., MONOCLONAL ANTIBODY TECHNOLOGY:
LABORATORY TECHNIQUES IN BIOCHEMISTRY AND MOLECULAR BIOLOGY, Elsevier Science Publishers,
Amsterdam, The Netherlands (1984); St. Groth et al., J. Immunol. Methods 35:1-21 (1980); Kohler and Milstein, Nature
256:495-497 (1975)), as are the trioma technique and the human B-cell hybridoma technique (Kozbor et al., Immunology Today
4:72 (1983); pgs. 77-96 of Cole et al., in MONOCLONAL ANTIBODIES AND CANCER THERAPY, Alan R. Liss, Inc.
(1985)).

Any animal (mouse, rabbit, etc.) which is known to produce antibodies can be immunized with the pseudogene
polypeptide. Methods for immunization are well known in the art. Such methods include subcutaneous or intraperitoneal
injection of the polypeptide. One skilled in the art will recognize that the amount of the protein encoded by the ORF of
the present invention used for immunization will vary based on the animal which is immunized, the antigenicity of the
peptide and the site of injection.

The protein which is used as an immunogen may be modified or administered in an adjuvant in order to increase 
the protein's antigenicity. Methods of increasing the antigenicity of a protein are well known in the art and include, but 
are not limited to, coupling the antigen with a heterologous protein (such as globulin or galactosidase) or through the
inclusion of an adjuvant during immunization. 

For monoclonal antibodies, spleen cells from the immunized animals are removed, fused with myeloma cells, such 
as SP2/0-Ag14 myeloma cells, and allowed to become monoclonal antibody producing hybridoma cells. 

Any one of a number of methods well known in the art can be used to identify the hybridoma cell which produces 
an antibody with the desired characteristics. These include screening the hybridomas with an ELISA assay, western
blot analysis, or radioimmunoassay (Lutz et al., Exp. Cell Res. 175: 109-124 (1988)). 

Hybridomas secreting the desired antibodies are cloned and the class and subclass are determined using procedures
known in the art (Campbell, A. M., Monoclonal Antibody Technology: Laboratory Techniques in Biochemistry and
Molecular Biology, Elsevier Science Publishers, Amsterdam, The Netherlands (1984)).

Techniques described for the production of single chain antibodies (U. S. Patent 4,946,778) can be adapted to
produce single chain antibodies to proteins of the present invention.

For polyclonal antibodies, antibody-containing antiserum is isolated from the immunized animal and is screened for
the presence of antibodies with the desired specificity using one of the above-described procedures.

The present invention further provides the above-described antibodies in detectably labelled form. Antibodies can
be detectably labelled through the use of radioisotopes, affinity labels (such as biotin, avidin, etc.), enzymatic labels
(such as horseradish peroxidase, alkaline phosphatase, etc.), fluorescent labels (such as FITC or rhodamine, etc.),
paramagnetic atoms, etc. Procedures for accomplishing such labelling are well known in the art (see, for example,
Sternberger et al., J. Histochem. Cytochem. 18:315 (1970); Bayer, E. A. et al., Meth. Enzym. 62:308 (1979); Engval,
E. et al., Immunol. 109:129 (1972); Goding, J. W., J. Immunol. Meth. 13:215 (1976)).

The labeled antibodies of the present invention can be used for in vitro, in vivo, and in situ assays to identify cells
or tissues in which a fragment of the Staphylococcus aureus genome is expressed.

The present invention further provides the above-described antibodies immobilized on a solid support. Examples 
of such solid supports include plastics such as polycarbonate, complex carbohydrates such as agarose and sepharose, 
acrylic resins such as polyacrylamide, and latex beads. Techniques for coupling antibodies to such solid supports
are well known in the art (Weir, D. M. et al., "Handbook of Experimental Immunology" 4th Ed., Blackwell Scientific
Publications, Oxford, England, Chapter 10 (1986); Jacoby, W. D. et al., Meth. Enzym. 34, Academic Press, N. Y. (1974)).
The immobilized antibodies of the present invention can be used for in vitro, in vivo, and in situ assays as well as for 
immunoaffinity purification of the proteins of the present invention. 

3. Diagnostic Assays and Kits

The present invention further provides methods to identify the expression of one of the ORFs of the present
invention, or homolog thereof, in a test sample, using one of the DFs, antigens or antibodies of the present invention.

In detail, such methods comprise incubating a test sample with one or more of the antibodies, or one or more of
the DFs, or one or more antigens of the present invention and assaying for binding of the DFs, antigens or antibodies
to components within the test sample.

Conditions for incubating a DF, antigen or antibody with a test sample vary. Incubation conditions depend on the
format employed in the assay, the detection methods employed, and the type and nature of the DF or antibody used
in the assay. One skilled in the art will recognize that any one of the commonly available hybridization, amplification
or immunological assay formats can readily be adapted to employ the DFs, antigens or antibodies of the present
invention. Examples of such assays can be found in Chard, T., An Introduction to Radioimmunoassay and Related
Techniques, Elsevier Science Publishers, Amsterdam, The Netherlands (1986); Bullock, G. R. et al., Techniques in
Immunocytochemistry, Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice
and Theory of Enzyme Immunoassays: Laboratory Techniques in Biochemistry and Molecular Biology, Elsevier
Science Publishers, Amsterdam, The Netherlands (1985); and PCT publication WO95/32291, all of which are hereby
incorporated herein by reference.

The test samples of the present invention include cells, protein or membrane extracts of cells, or biological fluids 
such as sputum, blood, serum, plasma, or urine. The test sample used in the above-described method will vary based
on the assay format, nature of the detection method and the tissues, cells or extracts used as the sample to be assayed.
Methods for preparing protein extracts or membrane extracts of cells are well known in the art and can readily be
adapted in order to obtain a sample which is compatible with the system utilized.

In another embodiment of the present invention, kits are provided which contain the necessary reagents to carry 
out the assays of the present invention. 

Specifically, the invention provides a compartmentalized kit to receive, in close confinement, one or more containers,
which comprises: (a) a first container comprising one of the DFs, antigens or antibodies of the present invention; and
(b) one or more other containers comprising one or more of the following: wash reagents, reagents capable of detecting
the presence of a bound DF, antigen or antibody.

In detail, a compartmentalized kit includes any kit in which reagents are contained in separate containers. Such 
containers include small glass containers, plastic containers, or strips of plastic or paper. Such containers allow one
to efficiently transfer reagents from one compartment to another compartment such that the samples and reagents are
not cross-contaminated, and the agents or solutions of each container can be added in a quantitative fashion from one
compartment to another. Such containers will include a container which will accept the test sample, a container which
contains the antibodies used in the assay, containers which contain wash reagents (such as phosphate buffered saline,
Tris-buffers, etc.), and containers which contain the reagents used to detect the bound antibody, antigen or DF.

Types of detection reagents include labelled nucleic acid probes, labelled secondary antibodies, or in the
alternative, if the primary antibody is labelled, the enzymatic, or antibody binding reagents which are capable of reacting with
the labelled antibody. One skilled in the art will readily recognize that the disclosed DFs, antigens and antibodies of the
present invention can be readily incorporated into one of the established kit formats which are well known in the art.

4. Screening Assay for Binding Agents 

Using the isolated proteins of the present invention, the present invention further provides methods of obtaining
and identifying agents which bind to a protein encoded by one of the ORFs of the present invention or to one of the
Staphylococcus aureus fragments and contigs herein described.

In general, such methods comprise steps of:

(a) contacting an agent with an isolated protein encoded by one of the ORFs of the present invention, or an isolated
fragment of the Staphylococcus aureus genome; and

(b) determining whether the agent binds to said protein or said fragment. 

The agents screened in the above assay can be, but are not limited to, peptides, carbohydrates, vitamin derivatives, 
or other pharmaceutical agents. The agents can be selected and screened at random or rationally selected or designed
using protein modeling techniques.

For random screening, agents such as peptides, carbohydrates, pharmaceutical agents and the like are selected 
at random and are assayed for their ability to bind to the protein encoded by the ORF of the present invention. 

Alternatively, agents may be rationally selected or designed. As used herein, an agent is said to be "rationally
selected or designed" when the agent is chosen based on the configuration of the particular protein. For example, one
skilled in the art can readily adapt currently available procedures to generate peptides, pharmaceutical agents and the
like capable of binding to a specific peptide sequence in order to generate rationally designed antipeptide peptides
(see, for example, Hurby et al., "Application of Synthetic Peptides: Antisense Peptides," in Synthetic Peptides, A User's
Guide, W. H. Freeman, NY (1992), pp. 289-307, and Kaspczak et al., Biochemistry 28:9230-8 (1989)), pharmaceutical
agents, or the like.

In addition to the foregoing, one class of agents of the present invention, as broadly described, can be used to
control gene expression through binding to one of the ORFs or EMFs of the present invention. As described above,
such agents can be randomly screened or rationally designed/selected. Targeting the ORF or EMF allows a skilled
artisan to design sequence specific or element specific agents, modulating the expression of either a single ORF or
multiple ORFs which rely on the same EMF for expression control.

One class of DNA binding agents are agents which contain base residues which hybridize or form a triple helix by
binding to DNA or RNA. Such agents can be based on the classic phosphodiester, ribonucleic acid backbone, or can
be a variety of sulfhydryl or polymeric derivatives which have base attachment capacity.

Agents suitable for use in these methods usually contain 20 to 40 bases and are designed to be complementary
to a region of the gene involved in transcription (triple helix - see Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney
et al., Science 241:456 (1988); and Dervan et al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano,
J. Neurochem. 56:560 (1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription from DNA, while antisense
RNA hybridization blocks translation of an mRNA molecule into polypeptide. Both techniques have been demonstrated
to be effective in model systems. Information contained in the sequences of the present invention can be used to design
antisense and triple helix-forming oligonucleotides, and other DNA binding agents.
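
By way of illustration only, the following Python sketch shows one way candidate antisense oligonucleotides in the 20 to 40 base range mentioned above could be enumerated from a target region by taking reverse complements of successive windows. The function names and the example sequence are hypothetical and are not taken from the sequences of the present invention; practical oligonucleotide design would weigh further criteria that are not modeled here.

```python
# Illustrative sketch only; names and the example sequence are hypothetical.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def antisense_candidates(target, length=20):
    """List (start, oligo) pairs complementary to each window of `target`.

    `target` is the sense-strand DNA of the region of interest; each oligo is
    written 5'->3' and is complementary to target[start:start + length].
    """
    if not 20 <= length <= 40:  # size range stated in the text
        raise ValueError("length should be between 20 and 40 bases")
    return [
        (start, reverse_complement(target[start:start + length]))
        for start in range(len(target) - length + 1)
    ]


example_target = "ATGAAAGGTCTTACGCTAGCATCGATCGGA"  # made-up 30-nt sequence
candidates = antisense_candidates(example_target, length=20)
```

Each returned oligonucleotide is complementary to one window of the target; ranking the candidates is left to the skilled artisan.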

5. Pharmaceutical Compositions and Vaccines 


The present invention further provides pharmaceutical agents which can be used to modulate the growth or
pathogenicity of Staphylococcus aureus, or another related organism, in vivo or in vitro. As used herein, a "pharmaceutical
agent" is defined as a composition of matter which can be formulated using known techniques to provide a
pharmaceutical composition. As used herein, the "pharmaceutical agents of the present invention" refers to the pharmaceutical
agents which are derived from the proteins encoded by the ORFs of the present invention or are agents which are
identified using the herein described assays.

As used herein, a pharmaceutical agent is said to "modulate the growth or pathogenicity of Staphylococcus aureus
or a related organism, in vivo or in vitro," when the agent reduces the rate of growth, rate of division, or viability of the
organism in question. The pharmaceutical agents of the present invention can modulate the growth or pathogenicity
of an organism in many fashions, although an understanding of the underlying mechanism of action is not needed to
practice the use of the pharmaceutical agents of the present invention. Some agents will modulate the growth or
pathogenicity by binding to an important protein, thus blocking the biological activity of the protein, while other agents may
bind to a component of the outer surface of the organism, blocking attachment or rendering the organism more prone
to attack by the body's natural immune system. Alternatively, the agent may comprise a protein encoded by one of the ORFs
of the present invention and serve as a vaccine. The development and use of vaccines derived from membrane
associated polypeptides are well known in the art. The inventors have identified particularly preferred immunogenic
Staphylococcus aureus polypeptides for use as vaccines. Such immunogenic polypeptides are described above and
summarized in Table 4, below.

As used herein, a "related organism" is a broad term which refers to any organism whose growth or pathogenicity 
can be modulated by one of the pharmaceutical agents of the present invention. In general, such an organism will
contain a homolog of the protein which is the target of the pharmaceutical agent or the protein used as a vaccine. As 
such, related organisms do not need to be bacterial but may be fungal or viral pathogens. 

The pharmaceutical agents and compositions of the present invention may be administered in a convenient
manner, such as by the oral, topical, intravenous, intraperitoneal, intramuscular, subcutaneous, intranasal or intradermal
routes. The pharmaceutical compositions are administered in an amount which is effective for treatment and/or
prophylaxis of the specific indication. In general, they are administered in an amount of at least about 1 mg/kg body weight
and in most cases they will be administered in an amount not in excess of about 1 g/kg body weight per day. In most
cases, the dosage is from about 0.1 mg/kg to about 10 g/kg body weight daily, taking into account the routes of
administration, symptoms, etc.

The agents of the present invention can be used in native form or can be modified to form a chemical derivative.
As used herein, a molecule is said to be a "chemical derivative" of another molecule when it contains additional chemical
moieties not normally a part of the molecule. Such moieties may improve the molecule's solubility, absorption, biological
half life, etc. The moieties may alternatively decrease the toxicity of the molecule, eliminate or attenuate any undesirable
side effect of the molecule, etc. Moieties capable of mediating such effects are disclosed in, among other sources,
REMINGTON'S PHARMACEUTICAL SCIENCES (1980) cited elsewhere herein.

For example, such moieties may change an immunological character of the functional derivative, such as affinity
for a given antibody. Such changes in immunomodulation activity are measured by the appropriate assay, such as a
competitive type immunoassay. Modifications of such protein properties as redox or thermal stability, biological
half-life, hydrophobicity, susceptibility to proteolytic degradation or the tendency to aggregate with carriers or into multimers
also may be effected in this way and can be assayed by methods well known to the skilled artisan.

The therapeutic effects of the agents of the present invention may be obtained by providing the agent to a patient
by any suitable means (e.g., by inhalation, intravenously, intramuscularly, subcutaneously, enterally, or parenterally). It is
preferred to administer the agent of the present invention so as to achieve an effective concentration within the blood
or tissue in which the growth of the organism is to be controlled. To achieve an effective blood concentration, the
preferred method is to administer the agent by injection. The administration may be by continuous infusion, or by single
or multiple injections.

In providing a patient with one of the agents of the present invention, the dosage of the administered agent will 
vary depending upon such factors as the patient's age, weight, height, sex, general medical condition, previous medical 
history, etc. In general, it is desirable to provide the recipient with a dosage of agent which is in the range of from about
1 pg/kg to 10 mg/kg (body weight of patient), although a lower or higher dosage may be administered. The
therapeutically effective dose can be lowered by using combinations of the agents of the present invention or another agent.

As used herein, two or more compounds or agents are said to be administered "in combination" with each other 
when either (1) the physiological effects of each compound, or (2) the serum concentrations of each compound can
be measured at the same time. The composition of the present invention can be administered concurrently with, prior
to, or following the administration of the other agent.

The agents of the present invention are intended to be provided to recipient subjects in an amount sufficient to 
decrease the rate of growth (as defined above) of the target organism. 

The administration of the agent(s) of the invention may be for either a "prophylactic" or "therapeutic" purpose.
When provided prophylactically, the agent(s) are provided in advance of any symptoms indicative of the organism's
growth. The prophylactic administration of the agent(s) serves to prevent, attenuate, or decrease the rate of onset of
any subsequent infection. When provided therapeutically, the agent(s) are provided at (or shortly after) the onset of an
indication of infection. The therapeutic administration of the compound(s) serves to attenuate the pathological
symptoms of the infection and to increase the rate of recovery.

The agents of the present invention are administered to a subject, such as a mammal or a patient, in a
pharmaceutically acceptable form and in a therapeutically effective concentration. A composition is said to be
"pharmacologically acceptable" if its administration can be tolerated by a recipient patient. Such an agent is said to be administered
in a "therapeutically effective amount" if the amount administered is physiologically significant. An agent is
physiologically significant if its presence results in a detectable change in the physiology of a recipient patient.

The agents of the present invention can be formulated according to known methods to prepare pharmaceutically
useful compositions, whereby these materials, or their functional derivatives, are combined in admixture with a
pharmaceutically acceptable carrier vehicle. Suitable vehicles and their formulation, inclusive of other human proteins,
e.g., human serum albumin, are described, for example, in REMINGTON'S PHARMACEUTICAL SCIENCES, 16th Ed.,
Osol, A., Ed., Mack Publishing, Easton PA (1980). In order to form a pharmaceutically acceptable composition suitable
for effective administration, such compositions will contain an effective amount of one or more of the agents of the
present invention, together with a suitable amount of carrier vehicle.

Additional pharmaceutical methods may be employed to control the duration of action. Controlled release preparations
may be achieved through the use of polymers to complex or absorb one or more of the agents of the present invention.
The controlled delivery may be effectuated by a variety of well known techniques, including formulation with
macromolecules such as, for example, polyesters, polyamino acids, polyvinylpyrrolidone, ethylenevinylacetate,
methylcellulose, carboxymethylcellulose, or protamine sulfate, adjusting the concentration of the macromolecules and the agent
in the formulation, and by appropriate use of methods of incorporation, which can be manipulated to effectuate a desired
time course of release. Another possible method to control the duration of action by controlled release preparations is
to incorporate agents of the present invention into particles of a polymeric material such as polyesters, polyamino
acids, hydrogels, poly(lactic acid) or ethylene vinylacetate copolymers. Alternatively, instead of incorporating these
agents into polymeric particles, it is possible to entrap these materials in microcapsules prepared, for example, by
coacervation techniques or by interfacial polymerization with, for example, hydroxymethylcellulose or
gelatine-microcapsules and poly(methylmethacrylate) microcapsules, respectively, or in colloidal drug delivery systems, for example,
liposomes, albumin microspheres, microemulsions, nanoparticles, and nanocapsules or in macroemulsions. Such
techniques are disclosed in REMINGTON'S PHARMACEUTICAL SCIENCES (1980).

The invention further provides a pharmaceutical pack or kit comprising one or more containers filled with one or
more of the ingredients of the pharmaceutical compositions of the invention. Associated with such container(s) can be
a notice in the form prescribed by a governmental agency regulating the manufacture, use or sale of pharmaceuticals
or biological products, which notice reflects approval by the agency of manufacture, use or sale for human
administration.

In addition, the agents of the present invention may be employed in conjunction with other therapeutic compounds.

6. Shot-Gun Approach to Megabase DNA Sequencing

The present invention further demonstrates that a large sequence can be sequenced using a random shotgun 
approach. This procedure, described in detail in the examples that follow, has eliminated the up front cost of isolating 
and ordering overlapping or contiguous subclones prior to the start of the sequencing protocols. 

Certain aspects of the present invention are described in greater detail in the examples that follow. The examples
are provided by way of illustration. Other aspects and embodiments of the present invention are contemplated by the
inventors, as will be clear to those of skill in the art from reading the present disclosure.

ILLUSTRATIVE EXAMPLES

LIBRARIES AND SEQUENCING 

1. Shotgun Sequencing Probability Analysis

The overall strategy for a shotgun approach to whole genome sequencing follows from the Lander and Waterman
(Lander and Waterman, Genomics 2:231 (1988)) application of the equation for the Poisson distribution. According
to this treatment, the probability, $P_0$, that any given base in a sequence of size L, in nucleotides, is not sequenced after
a certain amount, n, in nucleotides, of random sequence has been determined can be calculated by the equation
$P_0 = e^{-m}$, where m is n/L, the fold coverage. For instance, for a genome of 2.8 Mb, m=1 when 2.8 Mb of sequence has
been randomly generated (1X coverage). At that point, $P_0 = e^{-1} = 0.37$. The probability that any given base has not
been sequenced is the same as the probability that any region of the whole sequence L has not been determined and,
therefore, is equivalent to the fraction of the whole sequence that has yet to be determined. Thus, at one-fold coverage,
approximately 37% of a polynucleotide of size L, in nucleotides, has not been sequenced. When 14 Mb of sequence
has been generated, coverage is 5X for a 2.8 Mb sequence and the unsequenced fraction drops to 0.0067 or 0.67%. 5X coverage
of a 2.8 Mb sequence can be attained by sequencing approximately 17,000 random clones from both insert ends with
an average sequence read length of 410 bp.
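
The arithmetic of this paragraph can be checked with a short Python sketch. It assumes only the Poisson relation stated in the text, with fold coverage taken as sequenced nucleotides divided by genome size; the function names are illustrative and not part of the original disclosure.

```python
import math


def unsequenced_fraction(genome_size_bp, sequenced_bp):
    """P0 = exp(-m), where m is the fold coverage (sequenced bp / genome size)."""
    m = sequenced_bp / genome_size_bp
    return math.exp(-m)


def clones_for_coverage(genome_size_bp, fold, read_len_bp, reads_per_clone=2):
    """Clones needed to reach `fold` coverage when each clone is read from both ends."""
    return math.ceil(fold * genome_size_bp / (read_len_bp * reads_per_clone))


L = 2_800_000  # genome size used in the text, in nucleotides
p0_1x = unsequenced_fraction(L, 2_800_000)   # about 0.37 at 1X coverage
p0_5x = unsequenced_fraction(L, 14_000_000)  # about 0.0067 at 5X coverage
clones_5x = clones_for_coverage(L, 5, 410)   # roughly 17,000 clones
```

Under these assumptions the computed clone count is close to the approximately 17,000 clones quoted above.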

Similarly, the total gap length, G, is determined by the equation $G = Le^{-m}$, and the average gap size, g, follows the
equation $g = L/n$ (with n in this expression counting the random sequence reads, about 34,000 here, rather than
nucleotides). Thus, 5X coverage leaves about 240 gaps averaging about 82 bp in size in a sequence of a
polynucleotide 2.8 Mb long.
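
The gap figures can be reproduced approximately with the sketch below. It assumes that the n in the average gap size expression counts sequence reads (17,000 clones read from both ends, or 34,000 reads), since that reading gives the stated 82 bp; this interpretation and the function name are assumptions, not statements from the source.

```python
import math


def gap_statistics(genome_size_bp, n_reads, read_len_bp):
    """Return (fold coverage, total gap length, mean gap size, gap count)."""
    m = n_reads * read_len_bp / genome_size_bp  # fold coverage
    total_gap = genome_size_bp * math.exp(-m)   # G = L * exp(-m)
    mean_gap = genome_size_bp / n_reads         # g = L / n, n counted in reads
    n_gaps = total_gap / mean_gap               # equals n_reads * exp(-m)
    return m, total_gap, mean_gap, n_gaps


m, total_gap, mean_gap, n_gaps = gap_statistics(2_800_000, 34_000, 410)
```

With these inputs the coverage is just under 5X, the mean gap is about 82 bp, and the gap count falls in the low 230s, in line with the rounded figure of about 240 given above.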

The treatment above is essentially that of Lander and Waterman, Genomics 2:231 (1988).

2. Random Library Construction 


In order to approximate the random model described above during actual sequencing, a nearly ideal library of 
cloned genomic fragments is required. The following library construction procedure was developed to achieve this end. 

Staphylococcus aureus DNA was prepared by phenol extraction. A mixture containing 600 ug DNA in 3.3 ml of
300 mM sodium acetate, 10 mM Tris-HCl, 1 mM Na-EDTA, 30% glycerol was sonicated for 1 min. at 0°C in a Branson
Model 450 Sonicator at the lowest energy setting using a 3 mm probe. The sonicated DNA was ethanol precipitated
and redissolved in 500 ul TE buffer.

To create blunt-ends, a 100 ul aliquot of the resuspended DNA was digested with 5 units of BAL31 nuclease (New
England BioLabs) for 10 min at 30°C in 200 ul BAL31 buffer. The digested DNA was phenol-extracted,
ethanol-precipitated, redissolved in 100 ul TE buffer, and then size-fractionated by electrophoresis through a 1.0% low melting
temperature agarose gel. The section containing DNA fragments 1.6-2.0 kb in size was excised from the gel, the
LGT agarose was melted, and the resulting solution was extracted with phenol to separate the agarose from the DNA.
DNA was ethanol precipitated and redissolved in 20 ul of TE buffer for ligation to vector.

A two-step ligation procedure was used to produce a plasmid library with 97% inserts, of which >99% were single
inserts. The first ligation mixture (50 ul) contained 2 ug of DNA fragments, 2 ug pUC18 DNA (Pharmacia) cut with SmaI
and dephosphorylated with bacterial alkaline phosphatase, and 10 units of T4 ligase (GIBCO/BRL) and was incubated
at 14°C for 4 hr. The ligation mixture then was phenol extracted and ethanol precipitated, and the precipitated DNA
was dissolved in 20 ul TE buffer and electrophoresed on a 1.0% low melting agarose gel. Discrete bands in a ladder
were visualized by ethidium bromide-staining and UV illumination and identified by size as insert (i), vector (v), v+i,
v+2i, v+3i, etc. The portion of the gel containing v+i DNA was excised and the v+i DNA was recovered and resuspended
into 20 ul TE. The v+i DNA then was blunt-ended by T4 polymerase treatment for 5 min. at 37°C in a reaction mixture
(50 ul) containing the v+i linears, 500 uM each of the 4 dNTPs, and 9 units of T4 polymerase (New England BioLabs),
under recommended buffer conditions. After phenol extraction and ethanol precipitation the repaired v+i linears were
dissolved in 20 ul TE. The final ligation to produce circles was carried out in a 50 ul reaction containing 5 ul of v+i
linears and 5 units of T4 ligase at 14°C overnight. After 10 min. at 70°C the following day, the reaction mixture was
stored at -20°C.

This two-stage procedure resulted in a molecularly random collection of single-insert plasmid recombinants with 
minimal contamination from double-insert chimeras (<1%) or free vector (<3%). 

Since deviation from randomness can arise from propagation of the DNA in the host, E. coli host cells deficient in all
recombination and restriction functions (A. Greener, Strategies 3 (1):5 (1990)) were used to prevent rearrangements,
deletions, and loss of clones by restriction. Furthermore, transformed cells were plated directly on antibiotic diffusion
plates to avoid the usual broth recovery phase which allows multiplication and selection of the most rapidly growing cells.

Plating was carried out as follows. A 100 ul aliquot of Epicurian Coli SURE II Supercompetent Cells (Stratagene
200152) was thawed on ice and transferred to a chilled Falcon 2059 tube on ice. A 1.7 ul aliquot of 1.42 M
beta-mercaptoethanol was added to the aliquot of cells to a final concentration of 25 mM. Cells were incubated on ice for
10 min. A 1 ul aliquot of the final ligation was added to the cells and incubated on ice for 30 min. The cells were heat
pulsed for 30 sec. at 42°C and placed back on ice for 2 min. The outgrowth period in liquid culture was eliminated
from this protocol in order to minimize the preferential growth of any given transformed cell. Instead the transformation
mixture was plated directly on a nutrient rich SOB plate containing a 5 ml bottom layer of SOB agar (1.5% SOB agar:
20 g tryptone, 5 g yeast extract, 0.5 g NaCl, 1.5% Difco Agar per liter of media). The 5 ml bottom layer is supplemented
with 0.4 ml of 50 mg/ml ampicillin per 100 ml SOB agar. The 15 ml top layer of SOB agar is supplemented with 1 ml
X-Gal (2%), 1 ml MgCl2 (1 M), and 1 ml MgSO4 per 100 ml SOB agar. The 15 ml top layer was poured just prior to plating.
Our titer was approximately 100 colonies/10 ul aliquot of transformation.
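
As a quick check on the reagent concentrations in the plating protocol, the following Python sketch applies the usual dilution relation (stock concentration times stock volume, divided by total volume). The helper name is hypothetical; the volumes and stock concentrations are those given in the text.

```python
def final_concentration(stock_conc, stock_vol, sample_vol):
    """Concentration after adding stock_vol of stock to sample_vol (same volume units)."""
    return stock_conc * stock_vol / (stock_vol + sample_vol)


# 1.7 ul of 1.42 M (1420 mM) beta-mercaptoethanol added to 100 ul of cells
bme_mM = final_concentration(1420.0, 1.7, 100.0)

# 0.4 ml of 50 mg/ml (50,000 ug/ml) ampicillin added to 100 ml of SOB agar
amp_ug_per_ml = final_concentration(50_000.0, 0.4, 100.0)
```

The beta-mercaptoethanol value comes out slightly below the nominal 25 mM stated in the protocol (about 24 mM once the added volume is counted), and the ampicillin works out to roughly 200 ug/ml in the bottom layer.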

All colonies were picked for template preparation regardless of size. Thus, only clones lost due to "poison" DNA
or deleterious gene products would be deleted from the library, resulting in a slight increase in gap number over that
expected.

3. Random DNA Sequencing 


High quality double stranded DNA plasmid templates were prepared using an alkaline lysis method developed in 
collaboration with 5Prime -> 3Prime Inc. (Boulder, CO). Plasmid preparation was performed in a 96-well format for all 
stages of DNA preparation from bacterial growth through final DNA purification. Average template concentration was 
determined by running 25% of the samples on an agarose gel. DNA concentrations were not adjusted. 

Templates were also prepared from a Staphylococcus aureus lambda genomic library. An unamplified library was
constructed in Lambda DASH II vector (Stratagene). Staphylococcus aureus DNA (> 100 kb) was partially digested in
a reaction mixture (200 ul) containing 50 ug DNA, 1X Sau3AI buffer, 20 units Sau3AI for 6 min. at 23°C. The digested
DNA was phenol-extracted and centrifuged over a 10-40% sucrose gradient. Fractions containing genomic DNA of
15-25 kb were recovered by precipitation. One ul of fragments was used with 1 ul of DASH II vector (Stratagene) in
the recommended ligation reaction. One ul of the ligation mixture was used per packaging reaction following the
recommended protocol with the Gigapack II XL Packaging Extract. Phage were plated directly without amplification from
the packaging mixture (after dilution with 500 ul of recommended SM buffer and chloroform treatment). Yield was about
2.5x10^9 pfu/ul.

An amplified library was prepared from the primary packaging mixture according to the manufacturer's protocol.
The amplified library is stored frozen in 7% dimethylsulfoxide. The phage titer is approximately 1x10 s pfu/ml.

Mini-liquid lysates (0.1 ul) are prepared from randomly selected plaques and template is prepared by long range 
PCR. Samples are PCR amplified using modified T3 and T7 primers, and Elongase Supermix (LTI).

Sequencing reactions are carried out on plasmid templates using a combination of two workstations (BIOMEK
1000 and Hamilton Microlab 2200) and the Perkin-Elmer 9600 thermocycler with Applied Biosystems PRISM Ready
Reaction Dye Primer Cycle Sequencing Kits for the M13 forward (M13-21) and the M13 reverse (M13RP1) primers.
Dye terminator sequencing reactions are carried out on the lambda templates on a Perkin-Elmer 9600 Thermocycler
using the Applied Biosystems Ready Reaction Dye Terminator Cycle Sequencing kits. Modified T7 and T3 primers are
used to sequence the ends of the inserts from the Lambda DASH II library. Sequencing reactions are analyzed on a combination
of ABI 373 DNA Sequencers and ABI 377 DNA sequencers. All of the dye terminator sequencing reactions are analyzed
using the 2X 9 hour module on the ABI 377. Dye primer reactions are analyzed on a combination of ABI 373 and ABI
377 DNA sequencers. The overall sequencing success rate is approximately 85% for M13-21 and M13RP1
sequences and 65% for dye-terminator reactions. The average usable read length is 485 bp for M13-21 sequences,
445 bp for M13RP1 sequences, and 375 bp for dye-terminator reactions.
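
The success rates and read lengths above can be combined into a rough estimate of usable sequence per attempted reaction, as in the Python sketch below. The even split between forward and reverse dye primer reactions and the 14 Mb target (5X coverage of 2.8 Mb) are assumptions made for illustration; the function names are hypothetical.

```python
import math

# (success rate, average usable read length in bp), as reported in the text
REACTIONS = {
    "M13-21": (0.85, 485),
    "M13RP1": (0.85, 445),
    "dye-terminator": (0.65, 375),
}


def expected_bp_per_attempt(kind):
    """Expected usable bases from one attempted reaction of the given kind."""
    success, read_len = REACTIONS[kind]
    return success * read_len


def attempts_needed(target_bp, kinds):
    """Attempts needed to reach target_bp, splitting attempts evenly across kinds."""
    mean_yield = sum(expected_bp_per_attempt(k) for k in kinds) / len(kinds)
    return math.ceil(target_bp / mean_yield)


attempts_5x = attempts_needed(14_000_000, ["M13-21", "M13RP1"])
```

Under these assumptions each attempted dye primer reaction yields roughly 380 to 410 usable bases on average, so somewhat more than 35,000 attempts would be needed to accumulate 14 Mb of raw sequence.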

4. Protocol for Automated Cycle Sequencing



The sequencing was carried out using Hamilton Microstation 2200, Perkin Elmer 9600 thermocyclers, ABI 373
and ABI 377 Automated DNA Sequencers. The Hamilton combines pre-aliquoted templates and reaction mixes
consisting of deoxy- and dideoxynucleotides, the thermostable Taq DNA polymerase, fluorescently-labelled sequencing
primers, and reaction buffer. Reaction mixes and templates were combined in the wells of a 96-well thermocycling
plate and transferred to the Perkin Elmer 9600 thermocycler. Thirty consecutive cycles of linear amplification (i.e.,
one-primer synthesis) were performed, each including denaturation, annealing of primer and template, and extension (i.e.,
DNA synthesis). A heated lid with rubber gaskets on the thermocycling plate prevents evaporation without the need for
an oil overlay.

Two sequencing protocols were used: one for dye-labelled primers and a second for dye-labelled dideoxy chain
terminators. The shotgun sequencing involves use of four dye-labelled sequencing primers, one for each of the four
terminator nucleotides. Each dye-primer was labelled with a different fluorescent dye, permitting the four individual
reactions to be combined into one lane of the 373 or 377 DNA Sequencer for electrophoresis, detection, and
base-calling. ABI currently supplies premixed reaction mixes in bulk packages containing all the necessary non-template
reagents for sequencing. Sequencing can be done with both plasmid and PCR-generated templates with both
dye-primers and dye-terminators with approximately equal fidelity, although plasmid templates generally give longer
usable sequences.

Thirty-two reactions were loaded per ABI 373 Sequencer each day and 96 samples can be loaded on an ABI 377
per day. Electrophoresis was run overnight (ABI 373) or for 2 1/2 hours (ABI 377) following the manufacturer's protocols.
Following electrophoresis and fluorescence detection, the ABI 373 or ABI 377 performs automatic lane tracking and
base-calling. The lane-tracking was confirmed visually. Each sequence electropherogram (or fluorescence lane trace)
was inspected visually and assessed for quality. Trailing sequences of low quality were removed and the sequence
itself was loaded via software to a Sybase database (archived daily to 8mm tape). Leading vector polylinker sequence
was removed automatically by a software program. Average edited lengths of sequences from the standard ABI 373
or ABI 377 were around 400 bp and depend mostly on the quality of the template used for the sequencing reaction.
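
The automatic removal of leading vector polylinker sequence can be pictured with the following minimal Python sketch. It is not the software used here, which the text does not describe. The function name, the polylinker string and the 100-base search window are hypothetical choices made only for illustration.

```python
def trim_leading_vector(read: str, polylinker: str, search_window: int = 100) -> str:
    """Drop everything up to and including the polylinker when it lies near the 5' end.

    The polylinker must fall entirely inside the first `search_window` bases;
    otherwise the read is returned unchanged.
    """
    hit = read.upper().find(polylinker.upper(), 0, search_window)
    if hit == -1:
        return read  # no vector sequence recognised
    return read[hit + len(polylinker):]


# Hypothetical polylinker fragment and read, for illustration only.
POLYLINKER = "GGATCCTCTAGAGTCGAC"
raw_read = "NNAC" + POLYLINKER + "TCCATTATGAAGTCACAAGT"
trimmed = trim_leading_vector(raw_read, POLYLINKER)
print(trimmed)
```

An exact-match search is the simplest possible choice; a production tool would more plausibly tolerate base-calling errors in the vector region.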

INFORMATICS 

1. Data Management 

A number of information management systems for a large-scale sequencing lab have been developed. (For review
see, for instance, Kerlavage et al., Proceedings of the Twenty-Sixth Annual Hawaii International Conference on System
Sciences, IEEE Computer Society Press, Washington D. C., 585 (1993)). The system used to collect and assemble
the sequence data was developed using the Sybase relational database management system and was designed to
automate data flow wherever possible and to reduce user error. The database stores and correlates all information
collected during the entire operation from template preparation to final analysis of the genome. Because the raw output
of the ABI 373 Sequencers was based on a Macintosh platform and the data management system chosen was based
on a Unix platform, it was necessary to design and implement a variety of multi-user, client-server applications which
allow the raw data as well as analysis results to flow seamlessly into the database with a minimum of user effort.

2. Assembly 

An assembly engine (TIGR Assembler) developed for the rapid and accurate assembly of thousands of sequence
fragments was employed to generate contigs. The TIGR Assembler simultaneously clusters and assembles fragments
of the genome. In order to obtain the speed necessary to assemble more than 10^4 fragments, the algorithm builds a
hash table of 12 bp oligonucleotide subsequences to generate a list of potential sequence fragment overlaps. The
number of potential overlaps for each fragment determines which fragments are likely to fall into repetitive elements.
Beginning with a single seed sequence fragment, TIGR Assembler extends the current contig by attempting to add
the best matching fragment based on oligonucleotide content. The contig and candidate fragment are aligned using a
modified version of the Smith-Waterman algorithm which provides for optimal gapped alignments (Waterman, M. S.,
Methods in Enzymology 164:765 (1988)). The contig is extended by the fragment only if strict criteria for the quality
of the match are met. The match criteria include the minimum length of overlap, the maximum length of an unmatched
end, and the minimum percentage match. These criteria are automatically lowered by the algorithm in regions of minimal
coverage and raised in regions with a possible repetitive element. Fragments representing the boundaries of repetitive
elements and potentially chimeric fragments are often rejected based on partial mismatches at the ends of alignments
and excluded from the current contig. TIGR Assembler is designed to take advantage of clone size information
coupled with sequencing from both ends of each template. It enforces the constraint that sequence fragments from
two ends of the same template point toward one another in the contig and are located within a certain range of base
pairs (definable for each clone based on the known clone size range for a given library).
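
The hash table of 12 bp oligonucleotide subsequences can be illustrated with a short Python sketch. It is not the TIGR Assembler code. It only shows how indexing every 12-mer yields a list of candidate fragment overlaps and a per-fragment count of candidate partners. All function names and the toy fragments are assumptions introduced for illustration.

```python
from collections import defaultdict
from itertools import combinations

K = 12  # oligonucleotide length stated in the text


def build_kmer_index(fragments, k=K):
    """Map each k-mer to the set of fragment ids that contain it."""
    index = defaultdict(set)
    for frag_id, seq in fragments.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(frag_id)
    return index


def potential_overlaps(fragments, k=K):
    """Count the distinct k-mers shared by each pair of fragments."""
    shared = defaultdict(int)
    for ids in build_kmer_index(fragments, k).values():
        for a, b in combinations(sorted(ids), 2):
            shared[(a, b)] += 1
    return dict(shared)


def overlap_counts(fragments, k=K):
    """Number of candidate partners per fragment; high values hint at repeats."""
    counts = {frag_id: 0 for frag_id in fragments}
    for a, b in potential_overlaps(fragments, k):
        counts[a] += 1
        counts[b] += 1
    return counts


# Toy data: f1 and f2 are overlapping windows of one sequence, f3 is unrelated.
toy = "TCCATTATGAAGTCACAAGTACTATAAGCTGCGATGTTACCAATGTTTTT"
frags = {"f1": toy[0:30], "f2": toy[15:45], "f3": "G" * 12 + "C" * 12}
print(potential_overlaps(frags))
print(overlap_counts(frags))
```

In a real assembler the candidate pairs found this way would still have to pass the gapped alignment and the match criteria described above; the index only narrows the search.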

3. Identifying Genes 

The predicted coding regions of the Staphylococcus aureus genome were initially defined with the program zorl,
which finds ORFs of a minimum length. The predicted coding region sequences were used in searches against a
database of all Staphylococcus aureus nucleotide sequences from GenBank (release 92.0), using the BLASTN search
method to identify overlaps of 50 or more nucleotides with at least a 95% identity. Those ORFs with nucleotide sequence
matches are shown in Table 1. The ORFs without such matches were translated to protein sequences and
compared to a non-redundant database of known proteins generated by combining the Swiss-prot, PIR and GenPept
databases. ORFs of at least 80 amino acids that matched a database protein with BLASTP probability less than or
equal to 0.01 are shown in Table 2. The table also lists assigned functions based on the closest match in the databases.
ORFs of at least 120 amino acids that did not match protein or nucleotide sequences in the databases at these levels
are shown in Table 3.
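
Finding ORFs of a minimum length can be sketched in Python as below. This is not the program named above, whose algorithm is not described in the text. It is a plain six-frame scan for ATG-to-stop ORFs, and the default of 80 codons merely mirrors the 80 amino acid cut-off used for Table 2. The function names and the toy sequence are assumptions.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=80, start="ATG"):
    """Return (strand, frame, begin, end) for ATG-to-stop ORFs.

    Coordinates are 0-based on the scanned strand; `end` is exclusive and
    includes the stop codon. `min_codons` counts codons before the stop.
    """
    orfs = []
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            orf_start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if orf_start is None and codon == start:
                    orf_start = i  # first ATG after the previous stop
                elif orf_start is not None and codon in STOPS:
                    if (i - orf_start) // 3 >= min_codons:
                        orfs.append((strand, frame, orf_start, i + 3))
                    orf_start = None
    return orfs


# Toy sequence: one ORF of six codons (ATG plus five GCT) followed by TAA.
toy_seq = "CC" + "ATG" + "GCT" * 5 + "TAA" + "GG"
print(find_orfs(toy_seq, min_codons=5))
```

Bacterial genes can also start with GTG or TTG; the sketch keeps to ATG for brevity, which is a simplification rather than a statement about the method used here.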

ILLUSTRATIVE APPLICATIONS 

1. Production of an Antibody to a Staphylococcus aureus Protein

Substantially pure protein or polypeptide is isolated from the transfected or transformed cells using any one of the
methods known in the art. The protein can also be produced in a recombinant prokaryotic expression system, such as
E. coli, or can be chemically synthesized. Concentration of protein in the final preparation is adjusted, for example, by
concentration on an Amicon filter device, to the level of a few micrograms/ml. Monoclonal or polyclonal antibody to the
protein can then be prepared as follows.

2. Monoclonal Antibody Production by Hybridoma Fusion 

Monoclonal antibody to epitopes of any of the peptides identified and isolated as described can be prepared from
murine hybridomas according to the classical method of Kohler, G. and Milstein, C., Nature 256:495 (1975) or
modifications of the methods thereof. Briefly, a mouse is repetitively inoculated with a few micrograms of the selected protein
over a period of a few weeks. The mouse is then sacrificed, and the antibody producing cells of the spleen isolated.
The spleen cells are fused by means of polyethylene glycol with mouse myeloma cells, and the excess unfused cells
destroyed by growth of the system on selective media comprising aminopterin (HAT media). The successfully fused
cells are diluted and aliquots of the dilution placed in wells of a microtiter plate where growth of the culture is continued.
Antibody-producing clones are identified by detection of antibody in the supernatant fluid of the wells by immunoassay
procedures, such as ELISA, as originally described by Engvall, E., Meth. Enzymol. 70:419 (1980), and modified
methods thereof. Selected positive clones can be expanded and their monoclonal antibody product harvested for use.
Detailed procedures for monoclonal antibody production are described in Davis, L. et al., Basic Methods in Molecular
Biology, Elsevier, New York, Section 21-2 (1989).

3. Polyclonal Antibody Production by Immunization 

Polyclonal antiserum containing antibodies to heterogeneous epitopes of a single protein can be prepared by
immunizing suitable animals with the expressed protein described above, which can be unmodified or modified to enhance
immunogenicity. Effective polyclonal antibody production is affected by many factors related both to the antigen and
the host species. For example, small molecules tend to be less immunogenic than others and may require the use of
carriers and adjuvant. Also, host animals vary in response to site of inoculations and dose, with both inadequate and
excessive doses of antigen resulting in low titer antisera. Small doses (ng level) of antigen administered at multiple
intradermal sites appear to be most reliable. An effective immunization protocol for rabbits can be found in Vaitukaitis,
J. et al., J. Clin. Endocrinol. Metab. 33:988-991 (1971).

Booster injections can be given at regular intervals, and antiserum harvested when antibody titer thereof, as
determined semi-quantitatively, for example, by double immunodiffusion in agar against known concentrations of the
antigen, begins to fall. See, for example, Ouchterlony, O. et al., Chap. 19 in: Handbook of Experimental Immunology,
Wier, D., ed, Blackwell (1973). Plateau concentration of antibody is usually in the range of 0.1 to 0.2 mg/ml of serum
(about 12 uM). Affinity of the antisera for the antigen is determined by preparing competitive binding curves, as described,
for example, by Fisher, D., Chap. 42 in: Manual of Clinical Immunology, second edition, Rose and Friedman, eds., Amer.
Soc. For Microbiology, Washington, D. C. (1980).

Antibody preparations prepared according to either protocol are useful in quantitative immunoassays which
determine concentrations of antigen-bearing substances in biological samples; they are also used semi-quantitatively
or qualitatively to identify the presence of antigen in a biological sample. In addition, they are useful in various animal
models of Staphylococcal disease known to those of skill in the art as a means of evaluating the protein used to make
the antibody as a potential vaccine target or as a means of evaluating the antibody as a potential immunotherapeutic
reagent.

3. Preparation of PCR Primers and Amplification of DNA 

Various fragments of the Staphylococcus aureus genome, such as those of Tables 1-3 and SEQ ID NOS: 1-5,191,
can be used, in accordance with the present invention, to prepare PCR primers for a variety of uses. The PCR primers
are preferably at least 15 bases, and more preferably at least 18 bases in length. When selecting a primer sequence,
it is preferred that the primer pairs have approximately the same G/C ratio, so that melting temperatures are
approximately the same. The PCR primers and amplified DNA of this Example find use in the Examples that follow.
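
The preference for primer pairs with a similar G/C ratio can be checked with a short Python sketch. It uses the common rule of thumb of 2 degrees C per A/T and 4 degrees C per G/C (the Wallace rule), which is an approximation for short oligonucleotides and is not taken from this document. The function names, the 4 degree tolerance and the example primers are assumptions for illustration.

```python
def gc_fraction(primer):
    """Fraction of G and C bases in the primer."""
    p = primer.upper()
    return (p.count("G") + p.count("C")) / len(p)


def wallace_tm(primer):
    """Rule-of-thumb melting temperature in degrees C: 2 per A/T, 4 per G/C."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc


def primers_compatible(fwd, rev, min_len=15, max_tm_diff=4):
    """True if both primers meet the minimum length and have similar estimated Tm."""
    if len(fwd) < min_len or len(rev) < min_len:
        return False
    return abs(wallace_tm(fwd) - wallace_tm(rev)) <= max_tm_diff


# Hypothetical 20-base primers, for illustration only.
fwd = "TCCATTATGAAGTCACAAGT"
rev = "GCTAGTAGAGTTAGTTTCCT"
print(gc_fraction(fwd), wallace_tm(fwd), wallace_tm(rev), primers_compatible(fwd, rev))
```

The default minimum length of 15 follows the preferred lower bound given above; nearest-neighbour thermodynamic models give better estimates than this rule when precision matters.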
4. Gene expression from DNA Sequences Corresponding to ORFs 

A fragment of the Staphylococcus aureus genome provided in Tables 1-3 is introduced into an expression vector
using conventional technology. Techniques to transfer cloned sequences into expression vectors that direct protein
translation in mammalian, yeast, insect or bacterial expression systems are well known in the art. Commercially
available vectors and expression systems are available from a variety of suppliers including Stratagene (La Jolla, California),
Promega (Madison, Wisconsin), and Invitrogen (San Diego, California). If desired, to enhance expression and facilitate
proper protein folding, the codon context and codon pairing of the sequence may be optimized for the particular
expression organism, as explained by Hatfield et al., U. S. Patent No. 5,082,767, incorporated herein by this reference.

The following is provided as one exemplary method to generate polypeptide(s) from cloned ORFs of the
Staphylococcus aureus genome fragment. Bacterial ORFs generally lack a poly A addition signal. The addition signal sequence
can be added to the construct by, for example, splicing out the poly A addition sequence from pSG5 (Stratagene) using
BglI and SalI restriction endonuclease enzymes and incorporating it into the mammalian expression vector pXT1
(Stratagene) for use in eukaryotic expression systems. pXT1 contains the LTRs and a portion of the gag gene of Moloney
Murine Leukemia Virus. The positions of the LTRs in the construct allow efficient stable transfection. The vector includes
the Herpes Simplex thymidine kinase promoter and the selectable neomycin gene. The Staphylococcus aureus DNA
is obtained by PCR from the bacterial vector using oligonucleotide primers complementary to the Staphylococcus
aureus DNA and containing restriction endonuclease sequences for PstI incorporated into the 5' primer and BglII at
the 5' end of the corresponding Staphylococcus aureus DNA 3' primer, taking care to ensure that the Staphylococcus
aureus DNA is positioned such that it is followed by the poly A addition sequence. The purified fragment obtained from
the resulting PCR reaction is digested with PstI, blunt ended with an exonuclease, digested with BglII, purified and
ligated to pXT1, now containing a poly A addition sequence and digested with BglII.

The ligated product is transfected into mouse NIH 3T3 cells using Lipofectin (Life Technologies, Inc., Grand Island,
New York) under conditions outlined in the product specification. Positive transfectants are selected after growing the
transfected cells in 600 ug/ml G418 (Sigma, St. Louis, Missouri). The protein is preferably released into the supernatant.
However, if the protein has membrane binding domains, the protein may additionally be retained within the cell or
expression may be restricted to the cell surface. Since it may be necessary to purify and locate the transfected product,
synthetic 15-mer peptides synthesized from the predicted Staphylococcus aureus DNA sequence are injected into
mice to generate antibody to the polypeptide encoded by the Staphylococcus aureus DNA.

Alternatively, and if antibody production is not possible, the Staphylococcus aureus DNA sequence is additionally
incorporated into eukaryotic expression vectors and expressed as, for example, a globin fusion. Antibody to the globin
moiety then is used to purify the chimeric protein. Corresponding protease cleavage sites are engineered between the
globin moiety and the polypeptide encoded by the Staphylococcus aureus DNA so that the latter may be freed from
the former by simple protease digestion. One useful expression vector for generating globin chimerics is pSG5
(Stratagene). This vector encodes a rabbit globin. Intron II of the rabbit globin gene facilitates splicing of the expressed
transcript, and the polyadenylation signal incorporated into the construct increases the level of expression. These
techniques are well known to those skilled in the art of molecular biology. Standard methods are published in methods
texts such as Davis et al., cited elsewhere herein, and many of the methods are available from the technical assistance
representatives from Stratagene, Life Technologies, Inc., or Promega. Polypeptides of the invention also may be
produced using in vitro translation systems such as the In vitro Express(TM) Translation Kit (Stratagene).

While the present invention has been described in some detail for purposes of clarity and understanding, one
skilled in the art will appreciate that various changes in form and detail can be made without departing from the true
scope of the invention.

All patents, patent applications and publications referred to above are hereby incorporated by reference.

Table 4



ORF    SEQ ID NO    BLAST HOMOLOG    Antigenic Regions: Region 1    Region 2    Region 3    Region 4



168_6 1 5192 lipoprotein 



36-45 



84-103 ' 152-161 



176-185 



238.1 



5193 chrA 



21-39 



48-58 



84-95 



51.2 



278_3 



5194 OpjjB ge ne product (B. sub- 20-36 

5195 l ipo protein! 20-29 



70-79 



59-73 



1P0-1J_2 
85-97 



232-249 
121-131 



276.2 I 5196 lipoprotein 



21-33 



65-74 



177-186 



162-171 
211-220 



45.4 I 5197 ProX 



28-37 



59-69 



85-100 i 120-129 



315.8 



i 154.15 



228.3 



228.6 



51 98 hypoth e tical protein 

5199 '■ unknown 



45-54 



88-97 



182-192 



31-40 



48-58 



5200 j unknown 

5201 Unknown 



25-38 



40-52 



29-41 



89-101 



79-88 



64-74 



128-143 



50.1 



112.7 



442.1 



66.2 



304.2 



5202 unknown 



! 21-33 



52-61 



5 203 iron-binding periplasmic 



21-31 



58-67 



5204 1 unknown 



30-39 



91-100 



5205 '. unknown 



50-59 



44.1 



5206 jQ-btndinq periplasmic 



19-28 



i 

_i 

i 



104-116 



48-57 



168-182 



92-101 



122-137 



127-136 



2 43-25 3 
95-104 



80-89 



173-184 



197-206 

"in^io 

182-192 



i 

I 



161.4 



_5207 ! h ypothetical protein 



27-36 



86-95 



75-84_ 
129-138 



5208 :SphX 



27-44 



149-161 



46.5 



942.1 



5.4 



20.4 



328.2 



520.2 



771.1 



999.1 



5209 crnpC (permease) 



21-33 



61-70 



5210 traH fPlasmid pSK41 ] 



83-92 



5211 :ORF (5. aureus) 



12-22 



5212 ■ p ep tidoglycan hydrolase (S: 24-34 



109-118 



87-96 



129-138 



166-175 
83-92 



167-18 2 
T 103-116 
"192-201 
201-210 
100-109 



T" 
I 



127-142 



11 1-120 



141-150 



1 5 1 £1 60 
"161-171 



5213 lipoprotein (H. flu) 



81-90 



123-133 ! 290-299 



5214 j fibronectin binding protein : 44-54 



63-79 



81-90 



95-110 



5215 j emm1 gene product (S. py( 30-39 



65-82 ; ? 6- 106 j 1 12-121 



5216 ' pr edicted trithorax prot. (D 7-1 6_ 



120-129 



157-166 



853.1 I 



287.1 



521 7 ORF21 36 (Marchantia polyr 43-52 

5218 psaA homolog 1 3-22 



88-97 



102-111 ! 



28-44 I 72-82 



1 1 4-1 24 



288 2 5219 cell wall enzyme 



14-23 



89-98 ! 



596.Z S 5 2 20 penicilli n binding protein 2b 40-49 



59-68 



~7\7 S i ~5 221 fibronectin/fibrinogen bindit 28-37 



40-49 



76-87 ; 106-115 

62-71 i 93-111 



217.6 1 5222 fibronectin/fibrinogen bp 



528_3 I 5223 myosin cross reactive prote 
171.11) 5224 EF 1 



10-19 
4-13 



31-40 I 54-62 



29-47 I 60-73 



20-31 



91-110 



63.4 



353.2 



74311 



5225 • penicillin binding protein 2b 12-21 



59-68 



5226 



46-55 I 62-71 



95-104 



5227 29 kDa protein in fimA regw 23-32 



68-79 



94-103 



342.4 I 5228 Twitching motility 



10-19 



48-60 



83-92 



69.3 i 5229 arabinogalactan protein 



97-106 



132-141 



158-167 



70.6 



5230 nodulin 



36-45 



48-57 



137-160 



129.2 ! 5231 glycerol diester phosphodie 8-17 



41-50 



55-74 



58.5 
18£>=i 

235.6 



5232 PBP (S. aureus) 



26-35 



70-79 



117-126 



5233 MHC class H analog (S. aure 72-81 



94-103 



115-124 



5234 histidine kinase domain (Pic 24-33 



52-67 



81-94 



310.8 : 5235 clumping factor (S. aureus) 59-71 



77-86 



93-102 



601. 1 ' 5236 novel antiqen/ORF2 ( S, aui 45-54 



91-104 



108-117 



73-92 



90-99 



175-184 



1 11-121 
"18 0-189 
179-188 



97-106 



152-161 



1 36-14 5 
"106-1 21 ' 



118-127 



186-195 



544. 



3 : 5237 QRF YJR1 51 c (S. cerevisae! 76-90 101-111 131-140 154-164 



662 .1 5238 MHC class H analog (5. aure 22-32 



71-80 



89-98 



114-122 



J7.7 
120.1 



5239 5' nucleotidase precursor (' 29-45 

5240 ~ BS5G gene product (B. sub 102-1 1 1 



62-71 



105-114 



125-137

Table 4 (cont.)

In the continuation tables below, only ORFs with at least one entry in the listed columns are shown.

ORF      Antigenic Regions (cont.)
         Region 5   Region 6   Region 7   Region 8   Region 9   Region 10
168.6    244-272    303-315
238.1    260-269    291-301    308-317
51.2     140-152    188-208    211-220    256-266    273-283
278.3    198-209
276.2    255-268
45.4     177-199    221-230    234-243    268-279    284-293    304-313
154.15   148-157    177-187    202-211
228.3    101-119    139-154    166-181
112.7    136-149    197-211    218-229    253-273
442.1    199-210    247-257    264-277    287-309
304.2    178-187    250-259
46.5     131-141    162-176    206-215    243-252    264-273    285-294
5.4      189-205    230-239    246-264    301-318    340-354    378-387
20.4     202-212    217-234    260-275    314-336    366-373    380-391
771.1    145-154
287.1    154-164
596.2    121-130
217.5    244-253    259-268    288-297    302-311
217.6    144-158    174-183    188-197    207-216    226-242
743.1    197-207
69.3     195-211
70.6     206-215    263-272    291-301    331-340    358-371    390-414
129.2    117-127    141-157    168-183    202-211    222-231    261-270
58.5     184-203    260-269    275-299    330-344    372-381    424-433
236.6    138-147    163-172    187-198    244-261    268-278    308-317
310.8    131-140    144-153    177-186    190-199    204-213    216-227
601.1    208-218
544.3    170-179    184-193    224-235    274-287    327-336    352-361

ORF      Antigenic Regions (cont.)
         Region 11  Region 12  Region 13  Region 14  Region 15  Region 16
46.5     306-315
5.4      393-407    416-426    456-465
20.4     396-405    410-419    461-481
70.6     453-471    506-515
129.2    296-315
236.6    358-377    410-423    428-439    442-457    467-476    480-493
310.8    238-251    256-275    281-290    296-310    314-333    338-347

ORF      Antigenic Regions (cont.)
         Region 17  Region 18  Region 19  Region 20  Region 21  Region 22
310.8    357-366    370-379    429-438    443-452    478-487    551-560

ORF      Antigenic Regions (cont.)
         Region 23  Region 24  Region 25  Region 26  Region 27  Region 28
310.8    622-632    670-685    708-718    823-836    858-867    877-886

ORF      Antigenic Regions (cont.)
         Region 29  Region 30
(no entries)

Table 4 (cont.)

ORF    SEQ ID NO    BLAST HOMOLOG    Antigenic Regions: Region 1    Region 2    Region 3    Region 4



46,1 i52 41 
63 _4 5242 



aldeh yde dehydro genase 



8-17 



glycerol e ster hydrolase (P. _ 9-26 



174_6 



524 3 ket opantoate hydroxymeth 71 -80 



36^2 
_S7-73_ 
203-2 1 2 



83-96 



206, 1 6; 5244 ornithine acetyl tran sfera se 1 -1Q 



_; 34^43 

267.1 S5245 NaH-ant ipo rter protein (£. r 120>129 i 332-347 



93-107 
2_42-2S4~ 

5_4263 
3198-408 



123-133" 
265:274 
194-21*0 



_322_ 1 15 2 46 acrifla vin resistance protein 58-75 



153-1 64 ; 203- 231 | 2 64-28 4_ _ 



415.2 5247 
2 14^3 "15248 



transp o rt ATP-b inding pro t< 1 08-1 26 i 21 8 -227 
2-nitro pro pane dioxygenase 1 23-1 36 2 1 6-233 



587.3 15249 



-IS-lyiBR' ngJacto_r_ 



5-14 



43-54 



685^1 15250 



sjgna[ pep tidase 



59 : 68 

54.3    5251    fibronectin binding protein    23-32

SEQUENCE LISTING



(1) GENERAL INFORMATION: 

(i) APPLICANT: 

(A) NAME: Human Genome Sciences, Inc. 

(B) STREET: 9410 Key West Avenue 

(C) CITY: Rockville 

(D) STATE: Maryland 

(E) COUNTRY: US 

(F) POSTAL CODE: 20850

(ii) TITLE OF INVENTION: Staphylococcus aureus Polynucleotides and Sequences



(iii) NUMBER OF SEQUENCES : 5255 



(v) COMPUTER READABLE FORM: 
(A) MEDIUM TYPE: Diskette, 3.50 inch, 1.4 Mb storage

(B) COMPUTER: HP Vectra 486/33

(C) OPERATING SYSTEM: MSDOS version 6.2 

(D) SOFTWARE: ASCII Text 



(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 60/009,861 

(B) FILING DATE: 05-JAN-1996 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5895 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

TCCATTATGA AGTCACAAGT ACTATAAGCT GCGATGTTAC CAATGTTTTT TAAAATCCCA 60 

GTAATAAAAT CAAAAAATAA GTTAAATAAT GTATTCATTT TAAGTCCTCC TTAATAAAGa 120 

aaataGGTAA TAATGTAATA GCTTCTATTA TGATGCCTAA TTGAATGAAT TGGGCAAATG 180

GCTCTTTGAT GATAAGTGTG ATAATGAAAA GGGTTAAACT AACAATAATC GCATAATATT 240 

TTTTTCGTTT AATAAGTCGC ACAGGAATGG GCTTCTTTTT AGTTGCTGCA GGAGCATATA 300 

CTGAGATTAC ACCTAAAGAA ATAACTGTTA AAATAATCAT AATTAAAAAG TTAATATGAA 360

AATTTACTAT TACTAAAGGT AAAAGTATAA ATAGTATAAT ACTTTCTACA TAACACCAAA 420 

AAGAAGAAGG TGCATGTGCa CCATGTGCAT GtCTTCTTAT TAAATAAAAT GTTAAATTCG 480 

TAATTAACGT AAACAGAAAA ATGTTTAAAA TATAGGCAAT AGTATACATA ACAATTAATT 540 

TACCTATATT TTTAGCTAAG ACCTGCATCC CTAATCGTAC TTGCAAAAAT TGAATATGAT 600 

CTAAGTTATT TCTCTTTTGA AGATACGTGG CAAACTGGTC AATTTTATTA TCAAAATAAT 660 

TCAATTTTAC ACCACTCTCC TCACTGTCAT TATACGATTT AGTACAATCT TTTATCATTA 720 

TATTGCCTAA CTGTAGGAAA TAAATACTTA ACTGTTAAAT GTAATTTGTA TTTAATATTT 780 

TAACATAAAA AAATTTACAG TTAAGAATAA AAAACGACTA GTTAAGAAAA ATTGGAAAAT 840

AAATGCTTTT AGCATGTTTT AATATAACTA GATCACAGAG ATGTGATGGA AAATAGTTGA 900 

TGAGTTGTTT AATTTTAAGA ATTTTTATCT TAATTAAGGA AGGAGTGATT TCAATGGCAC 960 

AAGATATCAT TTCAACAATC GGTGACTTAG TAAAATGGAT TATCGACACA GTGAACAAAT 1020 

TCACTAAAAA ATAAGATGAA TAATTAA7TA CTTTCATTGT AAATTTGTTA TCTTCGTATA 1080 

GTACTAAAAG TATGAGTTAT TAAGCCATCC CAACTTAATA ACCATGTAAA ATTAGCAAGT 1140

GAGTAACATT TGCTAGTAGA GTTAGTTTCC TTGGACTCAG TGCTATGTAT TTTTCTTAAT 1200 

TATCATTACA GATAATTATT TCTAGCATGT AAGCTATCGT AAACAACATC GATTTATCAT 1260 

TATTTGATAA ATAAAATTTT TTTCATAATT AATAACATCC CCAAAAATAG ATTGAAAAAA 1320

TAACTGTAAA ACATTCCCTT AATAATAAGT ATGGTCGTGA GCCCCTCCCA AGCTCGCGGC 1380 

CTTTTTTGTA ATGAAGAAGG GATGAGTTAA TCATCATTAT GAGACCCGCC GTTAAAATAT 1440

TCATTTGCAA AGGGCGAAAT GGGTTCTTAC TGAGTTATCT ATTATAAAAA AATAAACATA 1560

GACTTATGAA AAATCTCTCA TAAATCTATG TTTAGTCATG aCATGTGTTA AATATTATTT 1620 

CGGGCGCTTC TTATTTATAC AAATCTAATT TAATACTTTT AAATACAGGT ATATTTTCgC 1680 

GTTGCTGTTC TACTTCATTT AAGTTTAAAT CTACAGTCAA AATATCTGCG GATTCATTTA 1740 

ATTCTCCAAC TAAATCTCCA TTTGGGTTTA TAACTATCGA ATGACCAGCA TATTCTGTGT 1800 

TACCATCGAA TCCAGTGCTA TTAGTTCCAA TGACAAACAT ATTATTTTCA ATTGCACGTG 1860 

CCTTTAGTAA TGAATGCCAA TGTTGAAGAC GTGACATAGG CCATTGCGCC ACATAAAATG 1920 

CAATTTTAGC ACCACTACGA GCAGGATATC TTAATAATTC TGGAAAACGT AAATCATAAC 1980

AGATAAGTTG GGTCACATAA GTACCGTCAG ACAATTGAAA GGGTTCAGCT ACGTATTCGC 2040 

CAGCGGTTAA AAATTCATGC TCTCTTAACA TAGGAACTAA ATGAACTTTG TCGTATTCaT 2100 

TAATCAGCTG GCCACTTTTA TTCACACTAA AAGCTGTATT AAATATTTGA TTGTTTCTAA 2160

TGTTAGAAAC TGACCCAGCT ACGATATCGA CTTTATATTT TTCAGCTAAA TGTTTAATAA 2220 

ATGAAAAACT TTGTCCTAGA TTATTATCTG CTTTTTCATT TAAATGCTCT AAATCATAGC 2280 

CATTATTCCA CATTTCAGGT AAAACGACTA CATCTACTTC AGCATTCATA TTTTTTTCGA 2340 

ACCATTGCGT TATTTGAGTT TCATTTTTAG AACTATCTCC AAAAACAATC GGTAATTGAT 2400

AAATTTGGAC TTTCATAACA TCACATCCTT GATAGATCTT ATATATAACT TACTAAAAGT 2460 

TATGTTGAAA CGCAAAAAAC GAGCACAAGA CATAAAATCA AAGTCCTAGG CTCTACAAAG 2520 

TTATATTGAC AGTAGTTGAT GGGGCCCCAA CATAGAGAAA TTGGAACACC AATTTCTACA 2580 

GACAATGCAA GTTGGGGTGG GCTCTAACAT AAAGAAATAC TTTTTCTTTA GAAATTAGTA 2640

TTTCTTATAC ATGAGTTTTA CTCATGTATT CCTATTCTTA AGTGCACATT AGCAGCGGCT 2700 

AATGTGTAAG AACTACTACA TAATGAATAA CTAATGATTC TTTATCATTT CTGTCCCATT 2760 

CCTAACAATA TATTGATTAT TTTTTTATTA CGAAACGATC TTCCACTGGA TTAAATGTTT 2820 

TTTCGCCAGC AGCTTCACGA ATATCACCAA ATGGCATTTG AGCAATAAGT TTCCAACTTT 2880

TAGGAATATT AAATTCATTT GAAGTCATCT CATCAACAAG TGGATTATAG TGTTGTAATG 2940 

AAGCACCTAT GCCTTTAGTA GCTAATGCAG TCCAAATTGC AAATTGATGC ATGGCATTTG 3000 

TTTGAGTTGA CCATATTGCA AAATTATCAT AGTAGTTTGG CATTTGTTCT TGTAAACCAC 3060 

TTACAACATC TTGATCTTCA TAAAACAAAA TTGTACCGTA TGAATGTTTG AAGTTATCAA 3120

TTTTTTGTTC AGTTGGCTCG AAATCACGAT TCTCTCCCAT GACTTCTTTT AAAATTGCTT 3180 

TTGTGTTATC CCAAAATTTA TTATTGTTGT CATTTAACAA GAGAACAATT CTAGTTGATT 3240 
CATCGCTAAT TGATATCGAA TCTTTCAAAT TATATATTGA ACGTCTTTCT TCCATTGCAT 3360
TGTCAAAAGT CATTGCTTTT TTATCTTTTT TAAATAAGCC CATAATTATT GCTCCTTCTT 3420
TAGTAAAGAA TACTTAATAG ACTAAGTATA AAATTTATAC TCGTACTTGT AAAGCAATAT 3480
TTACGAAAAT TTCAAGAATA TTAATATTCA TTTTCAAATT CCAAATATAA ATGCATTTTC 3540
AACGCATATT TATTATACTT AGATTAATAC TTACATGAAA AAGGGAGGTG TCTCGTGAAA 3600
TGTCATATCA TTGGTTTAAG AAAATGTTAC TTTCAACAAG TATTTTAATT TTAAGTAGTA 3660
GTAGTTTAGG GCTTGCAACG CACACAGTTG AAGCAAAGGA TAACTTAAAT GGAGAAAAAC 3720
CAACTACTAA TTTGAATCAT AATATAACTT CACCATCAGT AAATAGTGAA ATGAATAATA 3780
ATGAGACTGG GACACCTCAC GAATCAAATC AAACGGGTAA TGAAGGAACA GGTTCGAATA 3840
GTCGTGATGC TAATCCTGAT TCGAATAATG TGAAGCCAGA CTCAAACAAC CAAAACCCAA 3900
GTACAGATTC AAAACCAGAC CCAAATAACC AAAACTCAAG TCCGAATCCT AAACCAGATC 3960
CAGATAACCC GAAACCAAAA CCGGATCCAA AACCAGACCC AGATAAACCA AAGCCAAATC 4020
CGGATCCAAA ACCAGATCCA GATAACCCGA AACCAAATCC AGATCCAAAA CCAGACCCAG 4080
ATAAACCAAA GCCAAATCCG GATCCAAAAC CAGATCCAGA TAAACCAAAG CCAAATCCGA 4140
ATCCAAAACC AGACCCTAAT AAGCCAAATC CTAACCCGTC ACCAGATCCC GATCAACCTG 4200
GGGATTCCAA TCATTCTGGT GGCTCGAAAA ATGGGGGGAC ATGGAACCCA AATGCTTCAG 4260
ATGGATCTAA TCAAGGTCAA TGGCAACCAA ATGGGAATCA AGGAAACTCA CAAAATCCTA 4320
CTGGTAATGA TTTTGTATCC CAACGATTTT TAGCCTTGGC AAATGGGGCT TACAAGTATA 4380
ATCCGTATAT TTTAAATCAA ATTAATAAGT TGGGCAAAGA TTATGGAGAA GTTACTGATG 4440
AAGACATTTA TAATATTATT CGAAAACAAa ATTTCAGCGG AAATGCATAT TTAAATGGAT 4500
TACAACAGCA ATCGAATTAC TTTAGATTCC aATATTTCAA TCCATTGAAA TCAGAAAGGT 4560
ACTATCGTAA TTTAGATGAA CAAGTACTCG CATTAATTAC TGGTGAAATT GGATCAATGC 4620
CAGATTTGAA AAAGCCCGAA GATAAGCCGG ATTCAAAACA ACGCTCATTT GAACCGCATG 4680
AAAAAGACGA TTTTACAGTA GTTAAAAAAC AAGAAGATAA TAAGAAAAGT GCGTCAACTG 4740
CATATAGTAA AAGTTGGCTA GCAATTGTAT GTTCTATGAT GGTGGTATTT TCAATCATGC 4800
TATTCTTATT TGTAAAGCGA AATAAAAAGA AAAATAAAAA CGAATCACAG CGACGATAAT 4860
CCGTGTGTGA TTCGTTTTTT TTATTATGGA ATAAAAATGT GATATATAAA ATTCGCTTGT 4920
TCCGTGGCTT TTTTCAAAGC CTCAGGATTA AGTAATTGGA ATATAACGAC AAATCCGTTT 4980
TGTAACATAT GGATAATAAT TGGAACAGCA AGCCGTTTTG TCCAAACATA TGCTAATGAA 5040
AATATTAATG AACTTACTGT TGTAGCAATA ATAAATGCCA CGATACGATT ACCTTTAATC 5160

GCATTAAATA ATTCTCCAAA GATTACTTTT CTGAATACAT ATTCTTCTAA TAAAGGACCA 5220 

ATAATAGATA CAAAGAAGAT AAATATAGGT ATTTTTCGAG CAATAATAAT TAGCTTTTCT 5280 

GTATTAGGAC TTACTTGTTG TCCACCATAA ATTTGCGTTA ATACAATGCT CACTACCATT 5340 

TGATAAATCA TTACCAATGC AAATCCAAGC AATGCCCATG GAATGATATA TTTTTTAGGT 5400 

TCTTTAACTT CTAATTCTAA TTTTGTTGGA TTTTTAATTT TTAAATTAAT TAAAATAATC 5460 

GTCGTGGCGG CGATTAAAAA TAGAACAAGT TGTATGTAAA TGACTGCTTT AGTCAGTTCT 5520 

ATGCCACTAT ATTGTACAAA TGGTAATTTT TTTACAATGA GAAGCGGTAA AAATTGAGAC 5580

AATATATAAA TAATAACAGT TAGCAATGAT GCCCATAATC tTGTCATAAT TTTCCTCCAA 5640 

ATATTTGTTT ATAATTTATT TTATCGTAAA TAACTTGAAG TTACAAAACT TAATTAAAAG 5700 

GTTATGACTT GAAATTTTGA CCAAATTTGA TTATTATAAA TGTATGTTAG CACTCTTTAA 5760 

TGTTAAGTGC TAAACTTTAG GTTTTTTAAG GAGGAACAAT CATGCTAAAA CCAATTGGAA 5820 

ATCGTGTGAT TATTGAGAAA AAAGAACAAG AACAAACAAC TAAAAGTGGn ATTGTTTAAC 5880 

TGATAGTGCT AAAGA 5895 
(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6796 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

TTTGftAAAAA CAAGGTACGA TTGGTTTAAT AACATATATG AGAACCGATT CTACACGTAT 60 

TTCaGATACT GCCAAAGTTG AAGCAAAACA GTATATAACT GATAAATACG GTGAATCTTA 120 

CACTTCTAAA CGTAAAGCAT CAGGGAAACA AGGTGACCaA GATGCCCATG AGGCTATTAG 180 

ACCTTCAAGT ACTATGCGTA CGCCAGATGA TATGAAGTCA TTTTTGACGA AAGACCAATA 240 

CCGATTATAC AAATTAATTT GGGAACGATT TGTTGCTAGT CAAATGGCTC CAGCAATACT 300

TGATACAGTC TCATTAGACA TAACACAAGG TGACATTAAA TTTAGAGCGA ATGGTCAAAC 360 

AATCAAGTTT AAAGGATTTA TGACACTTTA TGTAGAAACT AAAGATGATA GTGATAGCGA 420

AAAGGAAAAT AAACTGCCTA AATTAGAGCA AGGTGATAAA GTCACAGCAA CTCAAATTGA 480

ACCAGCTCAA CACTATACAC AACCACCTCC AAGATATACT GAGGCGAGAT TAGTAAAAAC 540 

AAAGCGTAAC TATGTCAAAT TAGAAAGTAA GCGTTTTGTT CCTACTGAGT TGGGAGAAAT 660 

AGTTCATGAA CAAGTGAAAG AATACTTCCC AGAGATTATT GATGTGGAAT TCACAGTGAA 720 

TATGGAAACG TTACTTGATA AGATTGCAGA AGGCGACATT ACATGGAGGA AAGTAATCGA 780

CGGTTTCTTT AGTAGCTTTA AACAAGATGT TGAACGTGCT GAAGAAGAGA TGGAAAAGAT 840

TGAAATCAAA GATGAGCCAG CCGGTGAAGA CTGTGAAATT TGTGGTTCTC CTATGGTTAT 900

AAAAATGGGA CGCTATGGTA AGTTCATGGC TTGCTCAAAC TTCCCGGATT GTCGTAATAC 960 

AAAAGCGATA GTTAAGTCTA TTGGTGTTAA ATGTCCAAAA TGTAATGaTG GTGACGTCGT 1020 

AGAAAGAAAA TCTAAAAAGA ATCGTGTCTT TTATGGATGT TCGAAATATC CTGAATGCGA 1080

CTTTATCTCT TGGGATAAGC CGATTGGAAG AGATTGTCCA AAATGTAACC AATATCTTGT 1140 

TGAAAATAAA AAAGGCAAGA CAACACAAGT AATATGTTCA AATTGCGATT ATAAAGAGGC 1200 

AGCGCAGAAA TAATATTTTT ATTTCCTAGA TACATTTTAA GATTGTTAAA TAGAATCATT 1260 

AGTGAATCTT ATTTTAAAGA TAGTAAAGGA TTAATCTAAA TAAGTGCGGA TAATATAAAC 1320 

ATAACAACAT AATTAAmAGA CATAAATGAC aATAAAAGGA GTATAGAAAT GACTCAAACT 1380

GTAAATGTAA TAGGTGCTGG TCTTGCCGGT TCAGAAGCGG CATATCAATT AGCTGAAAGA 1440 

GGAATTAAAG TTAATCTAAT AGAGATGAGA CCTGTTAAAC AAACACCAGC GCACCATACT 1500 

GATAAATTTG CGGAACTTGT ATGTTCCAAT TCATTACGCG GAAATGCTTT AACTAATGGT 1560

GTGGGTGTTT TAAAAGAAGA AATGAGAAGA TTGAATTCTA TAATTATTGA AGCGGCTGAT 1620 

AAGGCACGAG TTCCAGCTGG TGGTGCATTA GCAGTTGATA GACACGATTT TTCAGGTTAT 1680 

ATTACTGAAA CACTTAAAAA TCATGAAAAT ATCACAGTTA TTAATGAAGA AATTAATGCC 1740 

ATTCCAGATG GATACACAAT TATCGCAACA GGACCACTTA CTACAGAAAC CCTTGCGCAA 1800 

GAAATAGTGG ACATTACTGG TAAAGATCAA CTTTATTTCT ATGATGCGGC TGCTCCAATT 1860 

ATTGAAAAAG AATCTATTGA TATGGATAAA GTTTACTTAA AGTCCCGTTA TGATAAAGGT 1920 

GAAGCTGCAT ATTTAAACTG TCCTATGACT GAGGATGAAT TTAATCGCTT TTATGATGCA 1980 

GTATTAGAAG CTGAAGTTGC GCCTGTAAAT TCATTTGAAA AAGAAAAATA TTTCGAGGGT 2040

TGTATGCCTT TTGAAGTAAT GGCAGAACGC GGACGCAAGA CATTACTATT TGGACCAATG 2100 

AAACCAGTAG GATTAGAAGA TCCAAAGACT GGGAAACGTC CTTATGCGGT GGTTCAATTA 2160 

AGACAAGATG ACGCTGCTGG TACACTCTAC AATATTGTTG GCTTCCAAAC GCATTTAAAA 2220 

TGGGGAGCTC AAAAAGAAGT CATTAAATTA ATTCCAGGCT TAGAAAATGT TGATATTGTT 2280 

AGATATGGTG TGATGCATAG AAATACCTTC ATTAATTCAC CGGACGTATT AAACGAGAAA 2340 
TATGTAGAAA GCGCAgcTAG CGGCTTAGTT GCAGGTATCA ATCTTGCGCA TAAAATATTA 2460

GGCAAGGGTG AGGTAGTATT TCCGAGAGAA ACAATGATTG GAAGTATGGC TTACTATATT 2520 

TCTCATGCTA AAAACAATAA GAATTTCCAA CCTATGAATG CTAACTTCGG GTTATTACCA 2580 

TCTTTAGAAA CTAGAATTAA AGATAAAAAA GAACGCTATG AAGCACAAGC TAATAGAGCT 2640 

TTGGATTACT TAGAAAATTT CAAAAAAACT TTATAAAATA GTTAGAAAGA CTAGATATGC 2700 

TATTCATTCT TAAGTCATCA ACGAGTAAGT AATGACTTTC TAAATGGAAA ATACTTATCC 2760 

TAGTCTTTTT AATTTTGGAA TTGTTACGTA TTTCTGACAA TTTAGAATTC GCATTCAAAA 2820 

AATATCTAAA TAAATAACAC GCAATAAGTT GATTGATGTA ACATGTAAGA GAATGTTTTA 2880

AATAAACTTT ATTTAAAAGG CAATGAAATA ATAAATGGCA AGGCTATTAA TAAAGACTTT 2940 

TAGTAATTAA TTTAAAAAAG AGGTATTCTA ATTAACAGGT TTTCCGATTA GTTACAATTA 3000 

TTTAATTCTC AAAAGATTTA GAATTGATTA TCAAATTACT GTAAGCCCTT TGCTGTATAT 3060 

GCTACAATTC TTATTGATGG AGGGTAAATG TATTGAATCA TATTCAAGAT GCGTTTTTAA 3120 

ATACATTGAA AGTTGAACGG AATTTTTCGG AACACACATT GAAATCATAT CAAGATGACT 3180 

TAATTCAGTT TAATCAATTT TTAGAACAAG AACATTTAGA GTTGAATACT TTTGAATACA 3240 

GAGATGCTAG AAATTATTTG AGCTATTTAT ATTCAAATCA TTTGAAAAGA ACATCTGTTT 3300 

CTCGTAAAAT CTCAACGTTA AGAACTTTCT ATGAATATTG GATGACGCTT GATGAGAACA 3360

TTATTAATCC ATTTGTTCAA TTAGTACATC CGAAAAAAGA AAAATATCTT CCGCAATTCT 3420 

TTTACGAAGA AGAAATGGAA GCGTTATTCA AAACTGTAGA AGAGGACACT TCAAAAAATT 3480 

TACGGGATCG AGTTATTCTT GAATTGTTGT ATGCTACAGG CATCCGTGTT TCGGAATTAG 3540 

TAAATATTAA AAAACAAGAT ATAGATTTTT ACGCGAATGG TGTTACCGTA TTAGGAAAAG 3600 

GGAQCAAAGA GCGCTTTGTA CCGTTTGGTG CTTATTGTAG ACAAAGCATC GAAAATTATT 3660

TAGAACATTT CAAACCAATT CAGTCATGCA ATCATGATTT TCTTATTGTA AATATGAAGG 3720

GTGAAGCAAT CACTGAACGC GGTGTACGAT ATGTTTTAAA TGATATTGTT AAACGAACAG 3780 

CAGGCGTAAG TGaGATTCAT CCCCACAAGC TCAGACATAC ATTTGCAACG CATTTATTGA 3840

ATCAAGGTGC AGACCTAAGA ACAGTACAAT CGTTATTAGG TCATGTTAAT TTGTCAACAA 3900 

CTGGTAAATA TACACACGTA TCTAACCAAC AATTAAGAAA AGTGTATCTA AATGCACATC 3960

CTCGAGCGAA AAAGGAGAAT GAAACATGAG TAATACAACA TTACATGCAA CAACAATTTA 4020 

TGCTGTAAGA CATAATGGGA AAGCAGCTAT GGCTGGAGAT GGGCAAGTAA CGCTTGGTCA 4080 

ACAAGTCATC ATGAAACAAA CGGCAAGAAA AGTGCGACGT TTATATGAAG GTAAAGTGTT 4140 
ATTACAACAG TTTAGTGGTA ACTTAGAAAG AGCTGCTGTT GAATTGGCAC AAGAATGGCG 4260

AGGCGATAAA CAATTACGTC AATTAGAAGC TATGCTAATT GTAATGGATA AAGATGCTAT 4320 

TTTAGTTGTC AGTGGAACTG GCGAAGTTAT TGCTCCAGAT GATGACCTTA TCGCTATTGG 4380 

ATCAGGAGGC AACTACGCAT TAAGCGCAGG ACGTGCATTG AAACGCCATG CATCGCATTT 4440 

GTCTGCTGAA GAAATGGCAT ATGAGAGCTT GAAAGTAGCG GCTGATATTT GTGTCTTTAC 4500

CAACGATAAT ATTGTTGTCG AAACACTATA ATAATCAGAG CACGATAAAT AATTACGAGC 4560 

AATTAATTTT AGTTAAAAGA CGGAGGAATG AAATTAATGG ATACAGCTGG AATAAGATTA 4620 

ACTCCAAAAG AAATCGTATC TAAATTAAAT GAATACATCG TTGGACAAAA TGATGCTAAA 4680

CGTAAAGTGG CAATTGCCCT ACGTAATCGA TACAGAAGAA GTTTATTAGA TGAGGAATCA 4740

AAGCAAGAAA TTTCACCTAA AAATATTTTG ATGATTGGAC CAACCGGCGT TGGTAAAACT 4800 

GAAATTGCAA GAAGAATGGC CAAAGTTGTC GGCGCGCCAT TTATAAAAGT AGAAGCTACT 4860 

AAATTTACTG AGGTAGGTTA TGTAGGACGA GATGTTGAAA GTATGGTTAG AGATCTTGTT 4920 

GATGTTTCAG TAAGATTAGT CAAGGCGCAG AAAAAATCAT TGGTACAAGA TGAAGCAACA 4980 

GCTAAGGCCA ATGAAAAACT TGTTAAGTTA TTAG7TCCAA GTATGAAAAA GAAAGCGTCT 5040 

CAAACGAATA ATCCTTTAGA GTCACTTTTC GGAGGTGCAA TTCCAAATTT CGGACAAAAT 5100 

AACGAAGATG AAGAAGAACC ACCTACTGAG GAAATTAAAA CAAAACGTTC TGAAATTAAG 5160

AGACAGCTAG AAGAAGGCAA ACTTGAAAAA GAAAAGGTAA GAATTAAAGT CGAACAAGAT 5220 

CCTGGTGCTT TAGGTATGCT AGGTACAAAT CAAAATCAGC AAATGCAAGA GATGATGAAT 5280 

CAATTAATGC CTAAAAAGAA AGTTGAGCGA GAAGTTGCTG TTGAGACGGC AAGGAAAATC 5340 

TTAGCTGATA GTTATGCGGA TGAACTAATT GATCAAGAAA GCGCTAACCA AGAAGCGCTT 5400

GAATTAGCAG AACAAATGGG TATCATCTTT ATAGATGAAA TCGACAAAGT TGCGACGAAT 5460 

AATCATAATA GTGGTCAAGA TGTCTCAAGA CAAGGTGTTC AAAGAGATAT TTTACCTATA 5520 

CTTGAAGGTA GCGTTATTCA AACCAAATAT GGTACTGTGA ATACTGAACA TATGCTGTTT 5580

ATAGGTGCTG GAGCTTTCCA TGTATCTAAG CCGAGTGACT TGATACCAGA ATTGCAAGGT 5640

CGTTTTCCGA TTAGAGTTGA ACTTGATAGT TTATCGGTAG AAGATTTTGT AAGAATTTTG 5700

ACAGAACCAA AATTGTCATT AATTAAACAA TATGAAGCAT TGCTTCAAAC AGAAGAAGTT 5760 

ACTGTAAACT TTACCGATGA AGCAATTACT CGCTTAGCTG AGATTGCTTA TCAAGTAAAT 5820 

CAAGATACAG ACAACATTGG TGCACGTCGA CTTCATACAA TTTTAGAAAA GATGCTAGAA 5880 

GATTTATCAT TCGAAGCACC AAGTATGCCG AATGCAGTTG TAGATATTAC CCCACAATAT 5940
AAATATACAA AAGGAGAAAA ATTCATGAGC TTATTATCTA AAACGAGAGA GTTAAACACG
TTACTTCAAA AACACAAAGG TATTGCGGTT GATTTTAAAG ATGTAGCACA AACGATTAGT 
AGCGTAACTG TAACAAATGT ATTTATTGTA TCGCGTCGAG GTAAAATTTT AGGATCGAGT 
CTAAATGAAT TATTAAAAAG TCAAAGAATT ATTCAAATGT TGGAAGAAAG ACATATTCCA 
AGTGAATATA CAGAACGATT AATGGAAGTT AAACAAACAG AATCAAATAT TGATATCGAC
AATGTATTAA CAGTATTCCC ACCTGAAAAC AGAGAATTAT TCATAGATAG TCGTACAACT 
ATCTTCCCAA TTTTAGGTGG AGGGGAAAGA TTAGGTACAT TAGTACTTGG TCnAGTACAT 
GATGATTTTA ATGaAAATGA TTTGGTACTA GGTGAATATG CTGCTACAGT TATTGGTATG 
GAAaTCTTAC GTGAGAAGCA TAGTGAAGTA GAAAnAGAAG CGCGCGATAA AGCTGCTATT 
ACAATGGCAA TTAATTCATT ATCTTATTCT GAAAAAGAAG CGATTGAACA TATCTTTGAA 
GAACTTGGCG GTACGGAAGG CCTATTAATC GCATCAAAAG TTGCAGATAG AGTTGGTATT 
ACTAGATCTG TAATTGTAAA TGCACTACGT AAATTAGAAA GTGCTGGTGT AATTGAATCA 
CGTTCTTTAG GAATGAAAGG TACTTTCATT AAAGTTAAAA AAGAAAAATT CTTAGATGAA 
TTAGAAAAAA GTAAAT 

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2073 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
ATCCTAAAAT TnAAAATTAT CACGCCTTTT GaACAGCTTT GTAACCaTCt GGACGATCAT 
kAAATTCCaA TGTAAATCCT GGTTTAAaGT TGATCTTTAA CCTTATTTAA AyCACCAATT 
GTACGTATAT TATGTTGTTT AGCAAAATCA CGTTTTACAG CTAAAGCATA CGTATTGTTA 
TACTTCATTG GTTTTAACAT AGTCATTTGA TATTTCTTTT CAAGACTTTG CTTAGCTTGT 
TCATAAACTT TTTTCTCTTC TTTTGACTTC AATGGTTCTT TTGTTAATTC ACCTAAAACT 
GTTCCAGTAA ATTCTAAATA CCCATCTATA TCGTCAGATT TTAAAGCATT AAATAAAAAT 
GCTGTTTTGC CCATACCATC TTTCACTTCT ACAGTATTTT TGGTCTCTTC TTCTATTAAA 
ATTTTATACA TATTTGTAAT AATCGATGGC TCGGAGCCAA GCTTTCCAGC TAACGTAATT 
TTATCACCTT TTTGTGCAAA CATAGGAATA GCGATAGCCA GTATAATAAT CATCACTATA 
TCAAATATAA TTGCCAATAA GGCTGCTGGA ATTGCACCTA ATAATATCAA CGATGCATTG 660

TTACGGTCTA TACCTAATAA AATTAAATCT CCTAGTCCGC CTGCACCAAT TAATGCTGCT 720

AGTGTTGCTG TACCTATAAT TAATACCATA GCCGTTCTTA CACCAGCCAT TATAACAGGC 780

ATTGCTATCG GAAGTTCGAC TTTAGTTAAA CGTCTAAATG GTTTCATACC TATACCTTTA 840 

GCCGCTTCAA TGAGTGATGG ATCAACTTCT TTAATTCCAG TATACGTATT CCTTAAAATT 900 

GGTAACAACG CATACACTAC AAGTGCAATA ATTGCTGGCA CACGACCGAT ACCAAATAAA 960 

GGAATCATTA AACCTAATAA TGCCAACGAT GGTATGGTTT GAAGAATTGC CGCAATATTC 1020 

ATTACGATTT CAGATATCGT TTTAGTCTTC GTTAATAAAA TACCTAATGG TACCGCAATA 1080 

GCAGTTGCAA TCAATAATGC GATAAATGAT ATTTGAATAT GTTCTATCAT TGTCGAAAAG 1140

AGTTGCCCCT TACGTTCACT CAATATGTCg AAAAAGTTAG TCATGTTGAG CTACCTCCTT 1200 

TTTCTGGGAC AAATATTTGA AGATATCTTT CCTATCAATA ACATATTGAC CTACGCTATC 1260

TTCTTGCATG ACAATGACAC GCTCGCTCTC TGATAAAAGT TGATACAATA CTTCAATTGG 1320 

TTGATTGTCA TAAACAATTG GATAAGCGCT CATAGATGTA ACCTCATCGA TTGGTTTCAT 1380 

AATATCCAAG TCACGGATAA TTGCGTTCTC TTCAACACAT GGCGCATCAT CTTCTAAATG 1440 

ACTACCCATA AATTGTTTAA CAAATTCACT TTGAGGATTA TTTTTAAATC CTTCTGGTGT 1500 

GTCAATTTGT TCAATATGCC CTTCATTCAA AAGACAAATC TTATCACCAA GTTTCATCGC 1560 

CTCTTGAATA TCATGTGTAA CAAATATGAT TGTCTTCTTA ATTTTAGTTT GTAATTCAAT 1620 

TAAATCATCT TGAAGTTTTT CTCGGCTGAT TGGGTCTAAT GCACTAAACG GTTCATCCAT 1680 

TAAAATAACT GGTGGATCAG CTGCTAACGC ACGTATAACT CCTACACGTT GTCGTTGCCC 1740

CCCTGACAAT TCATCAGGTT TTCTGTTTTT ATATTTTTCA GGTTCTAATC CAACCATTTC 1800 

AAGTRATTCA TCTACTCTTT TATCTATATC TTTTTCTTTC CACTTTTTCA TTTGTGGCAC 1860 

TTGTGCAAtA TTTTCTTTGa wTGTCaTATG TGGGAATAAT GCAATCTGCT GcAATACGTA 1920

TCCAATATCC CAACkCATTT CGTATACTGG ATAATCACTT ATTGGTTTAT CTTTAAAATA 1980 

AA7ATAACC7 TCACTTAAGT GAATGAGTCG ATTAATCATT TTTAATGTCG TAGTTTTTCC 2040 

ACAACCTGAA GGTCCAATTA GCACAAAAAA TTC 2073 
(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13321 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

ACTATTCTAG CTTCATCAGT TATCATATAT TCTTTGAAAC ACTTGTAAGA AAATATAATG 60 

AGTATTTACT ACATAATGAT ATTTCAAATT AGAAAAAAGG AAGTTATGAT TTAATGGCCT 120

TGAGCCTATC ATAACTTCCT TTTATCATTT TATTGTTGTG TTGATGTTTC GATAACGTGG 180 

TACATCTTAT CAAACATCAA TTCGAAACCA TGCACCATGG CATCATGATA TTCTTTTTTC 240 

TTTTGCTTGT ATTCTAAATT AGTAAATCGT CTTTCTTTTT CAACTAATGA ACGATAATAA 300 

AATAGCATTT GGGTGCCACC TGTTTCACGT TCAAAAAATT CTACCTCAAT GACATCTTGC 360 

GTTTCACTTA GTCCAGGCAT ACCGATAGTC ATCTTAACGT ATTCATCCAT AACTAAAGAT 420 

TCATAAATGC CTTCAATCAC ATTTACTTTG CCATTACGTT GTTGATCTAC AATACGATAT 480

TTACCGCCTT CTTTAACGTC CGCTTCAATC TCTTTATTCG TTCTGGCTGA TGTCATAAAC 540 

CATTGTTTCA ACAAATCTTT CTTTGTCCAA GCTTCGTATA CTAACTCTGG AGAAAATTTA 600

TAAAGCTTTT CAATTTCAAC TTCGACATGT TCATTCTCTA CATTAAATTT TGCCACTGTT 660 

GTCCACCCAC TTTCGCTCTT ACTTTTATTT TAACGTATTT TTGCTCAGTT CCAAACATAG 720 

ATGATCATCA TTTTTAAAAG ATTAGCGTTA TACGGTGAGT ACAACATGAT CTGTTAATAT 780

AACAAGCCAC CTTACTTGGC TACATCGATA TATTGTTAAG CATTAATGTT TCATTTCTTG 840 

ACTAGTGTTC TTTTTTAGCT TTGGAAAATT AAATAAAATC GCAATAAGTC CGCATACACC 900 

TAATAATATA GGATAAATGC TGTATGGGAA TAACATTAAC GGTGAAATAC CAGCTACACC 960 

AGCCGCTGaA ATGACTTGCG GGCTATATGG TAATAAACCT TGGAAGCAGC CTCCAAATAT 1020 

ATCAAGAATA CTTGCTGATT TCCTTGAATC TACATCATAT TCATCTGCAA TATTTTTAGC 1080

TAAAGGACCT GACATAATAA TAGAGATGGT GTTGTTTGCC GTGGCAATAT CTGCGACACT 1140 

TACC&AACTA GCAATTCCTA ATTCTGCGCC ACGCTTTGAT TTCACTTTAG AGCGAACAAA 1200 

TTGCAACAAC CATTCAATAC CACCATTGTG TTGAATAATA CCGACTAAAC CACCAATTAG 1260

CAACGCAATC ATAGCAATAT CTTCCATGCT TATAATACCT TTGGACACTG CATCTAGTAG 1320 

CCCCATCCAA CCGAATGAAC CATCTATGAG ACCAATGATT CCGGCTAATA ATGTTCCGCC 1380 

AATCAATACG ATAATGACAT TTACACCTAA TAATGCTAAT ACCAATACTA AGATATACGG 1440 

TACAACTTTA ATTAGATTAT AATCATAGTt TTTAGCATGA TTTAAAGAAA TGCCATTCGT 1500 

TAAGAAATAC AGAATAATAA TCGTTAAAAT AGCACCTGGC AATACAATTT TAAAGTTTAC 1560

TCTGAATTTA TCTTTCATTT TCGTATGTTG TGTTCTAACC GCAGCAATTG TTGTATCTGA 1620 

AATCATTGAT AGATTATCGC CGAACATTGC ACCTCCAACA ACTGTAGCCa tTGctAGCGC 1680 
TCCTACAGAC GTCCCCATAG ATATAGAAAC AAACATACAA ATCACAAACA ATCCTACAAT 1800

AATTAAATTT TCTGGGATTA ATGATAGTCC TAAATTAACT GTCGACTTTA CGCCACCCAT 1860

TTTTTCAGCT GTATTTGAAA ATGCACCTGC TAAAATAAAA ATCAACATCA TTAAAACAAT 1920 

GTTTGAATGG CCTGCACCTT TCGTGAAGAC CTCAACTTTT TTAGCAAATG ATTCTTTTCG 1980 

ATTCATTAAT AACGCCACAA TTACCGTTAT CGTAATTGCA ACATTTAATG GCATTGAAGT 2040

AAAATCACCT GTGATAATAC CTACGCCTAA AAACAACGCC ACAAATAATA ACAAGGGGAA 2100 

TAATGCCCAA GCATTGCTCT TTTTATGTAC TTCCATCCTT TTTACCTGCT TTCCAATTAA 2160 

AAATACCTCT TTCTCACAAA CGATGAAGAA AGAGGTTTTC ATGTGCTTTA CCTGCTTATC 2220

TTCAAACCAT TACGGTTACT GGAATTGGCA CATTCGAGAT GTTGCCGAGG CTTCATAGGG 2280 

CCAGTCCCTC CACCTCTCTA GATAAGTGAT GCTTATTTAC GTTTACGTTA CAAGATAATC 2340 

CTTAGTACGT CAATCATAAA TTAATCAGGA GTCGTATAAT ATTTTTCATA AACAATCATT 2400 

GCTACTGTAA TAATAATCAA AACAATAATG CTAATAACAA GTAAAAGCCA CCATTTAAGC 2460 

ATTAATGCAA TAAAAATGAA CACGATAGAC ACACTTACTA ATATTAATGA TATGACTTTA 2520 

AATTGCTGAA CACGTTGCTT GGAGATGACT TTCAACTGTT TGTTTGATAG ACGCGTATTT 2580 

TTTATACTGA TTCCCAGTAT ATTTTCTAAT ATTTGAACCA ATACGATACT TATTGCAAAT 2640 

ATAATAATTG GTAAAACATC ATAGCTCCCT ATAGTTAATG TATAAATTAC AAATCCAATG 2700

TAAAGTAACC CTGAGACAAA GGATAAAAAG TATGCGACGT ATTTGTTAAA CTTAATGATA 2760

TGCTTTTTAA CGTTTTGATG TGTAAACCAT ACATTCGAAA CGATCGCAAC TGCTACAAAT 2820 

AATGTGAATA CTATATATAA TGGTAATTTT TGTTCAGGAA AAACAGTCGC TATTCCAAAA 2880 

GCTAATGCTA AAATCAAAAA TAATATAGCT CTAGATACTA TTAATGCCAT AATAACAACC 2940 

CCTTTGTTTA ATATCGAGTT TGCAAATTTA CGTTTATCAG CGTTTCTATG ATCAGTACTT 3000 

CTACGGGTAG CGTTTCTATG TAATTTACAT CATCTTAACA TATAAATACT TCGCTATTTA 3060 

ATTGAAAACA TATCCTATTA TTCTTTGTCC GTTCTGACGT TTAATATCTA GCCTTAGGCA 3120 

TTTCACTTGT TAATGAATTT AACTTTCTTC CACTAACCGT CCCTAAACCC AATCCCGCAA 3180

CAGTTTTTAA CTTTTTCGTT GTTGTCCTGA CATCCTCATT AAGAAAGTTT ATTCTGCTTA 3240 

AAACTTATAA TCCACACCCT GAGCAAACGC TCCTTATGAC AGAGTATTAA AATAAGCCGA 3300 

TAAAGATACA CACCTTTACC GACTATTTAA AATACACTTC ACCAATTCAT TTTAATTTAA 3360

TGGATTGAAG TAACTAAATT AATATTATGT TGTTCAATTA AAAGCTTCAT ACAAACCTAA 3420

TCTATTTGCA CTCCACCGCT AACACCGAAC ACTTGTCCGG TTGTATAACT TGATTCTTCT 3480
GTTTTTTGAC CAAATGTTGG GATTTTACTT TGAGGTTGTC CACCAGAAAT TTGTAATGGT
GACCAGAATG GACCAGGCGC TACACAGTTC ACTCTAATTC CTTTTGGTCC TAATTCTTCT
GAAAAACTTT TAGTTAATGA AATAATTGCT GCTTTTGAAG CGGCATAATC ATGAAGAATA
GGACTAGGAT TATAACCTTG TACAGATGAT GTCGTTGTAA TTGACGCACC CGGTTTTAAA
TATTCCAATG CTTTTTGAAC TGTCCAAAAT AGCGGATAGA CATTCGTTTC AAATGTTTCT
GTAAATGCCT CAGTTGTAAA TCCATGAATA TCATCATGAT ACTGTTGATG TCCAGCAACT
AAAGTAACAT TATCTAAGCC ACCTAATTGT TGATATGCTT GTTCAACAAG GTCATAGTTG
AACTGTTCAT CTCTTATATC ACCAGGAATT AACACTGCCT TTTGACCACT TTCTTCAATC
ACTTGGCGTA CTTCTTGTGC ATCTTGTTCT TCACTCGGAA GATAGTTAAT CGCTACATCT
GCACCTTCTT TAGCATACGC AATTGCTGCT GCACGCCCTA TTGCTGAGTC ACCACCTGTG
ACTAATATTT TATAGCCTTG TAAGCGTTGA TGACCTTGGT AAGACGTTTC GCCACAATCG
GGTGCTGGCG TCATTTCAGA TTGTAAACCC GGTACCTCTT GTTCTTGTTT TTCATAATCC
GTTGTTTTAA ATTTTGTTCT AGGATCTTGA GCTGCCATTT TTTTACATCT CCTTATTCGC
TTAATGGTTA TTATTTACCC AATCTTCCTA GGAACTTAAT CATGATTACA CTAAAAATTA
CTTTCTTCTT TATAAAAACA AGCTCGAATT ATTCATGCAA TAGTCTCTTT ACAAATTCAA
CAAAATACTC AGGTACTTTT TCCAGAATCC TTTCATCCGG TTTATATTGA GGATGATGTA
AATCATATTC ACTATGAGAA CCAATTAACG CAAATACACT TGGAAAATGT TGACTATAAC
CTGAAAAATC TTCTCCAATC GTAAGCGGCT GTTCCATCAT TCCCACCTTA TATCCAACAT
GTTGGGCTAC TGCAATTGCT TTATGCGTCA ATGCCTCATC ATTCATCACA GCGCCAGGTA
AATGCGTATA ATTTAAATTA ATTTTCATAT TATATGCTTG AGCCAATCCG TCCGCAATAT
CTTGTAATCG TGTTTCTACA AGCTTTCGTA CCACAGGATC AAAACTACGC ACTGTGCCTT
GTACATACGC ATGATCAGCA ATGACATTCC AAGTATTACC ACATGATATT TGTCCAATTG
TTACTACCGC TTCATCAAAC GCAGATAGAT TTCTACTAAC TATGGATTGA ATACTATTAA
TCAATTGCGC CAACACAATA ACTGGATCGT TGCATTGTTC TGGcTTTGCA GCATGACCAC
CCACGCCTTT AATATGAAAC TCAAAACGAT CTACTGCTGA TGTAATTGCC CCTGTTTTGA
TTGGAAATGT ACCTACCGAA CGCGATGGGT CATTATGAAA ACCCAATACT GCTTGTACAT
CTTTTAATGC ATGTGTTTCA ATAATTTTAA AAGCGCCATG TCCTAGTTCT TCTGCTGATT
GAAAAATGAA TTTAACACGC CCAGTAAGAG TGCCCTCAAT TTCTTTTAAT TTTACAGCTG
TAGCCAAAAT ACTAGCCATG TGAATATCAT GACCACACGC ATGCATAACA CCTTCATTTT
CAGCTATACA ACTCAGACCT TGTCCCACTT CAGCAACAAG CCCAGTCGCA AGTGGTAAGT 5400

CTAATATTCT AATATGATGT TCTGTTAAAA TATCTTTAAT TTTTTGTGTA GTCTTAAATT 5460

CTTTATCGGA TAGTTCTGGA AATTGATGAA AATACCTTCT CCAGGTAACA GCTTGATCTT 5520 

TTAATCCCAT CGGTCATTCC CCTTCCTTAA GTCAATGATA TGTTGTCTAC CCTACGATGA 5580 
TCATCTTTGA CTATTAAACG ATGATTTCAC AACAATGTAC TCTTGTTAAT TGCTTTCGTT 5640

AATGATAGAC AGTTGTTTAA TAATATCGTA ACACTGTTGT CAAACTATTC TAACTTTTAT 5700

AATTGAGACT CTATACAAAA ACGTGTTCTC GAATATACTT GTTTTTACAA ACCACAAAAA 5760 

GCTCTAAACA TTAGTTTAAA CCAATGCTTA GAGCTTTCTA ATTATTTTAT GCTTTAAAAG 5820

ATACTGTGTT ATCTACGATG ACCTTACCGT CTTTAATAAC TTTTTCTGCG TGATTGATAC 5880

CAAAATGATA TGGAATATAT TCATGATTTG GTGCATCCCA AATTACTAAA TTAGCCTTAT 5940 

CACCTGTGTT AATTGTACCC GCGTTAATGT CTATTGCTTT AGCAGCATTG ACCGTAACAG 6000 

CATTCCAAAC TTCATTAGGT GATAGCTTTA ATTTCAAGGC TGCAATCGCC ATAACAAGTT 6060 

GTAAGTTGTT TGTGACACTA CTACCAGGGT TATAATCAGT TGCTAATGCA ATCGCACCGT 6120 

TATTGTCAAG CATGCCTCTT GCATCTGCAT AATCTTCTTT ACCTAAATAG AACGTCGTTG 6180 

CAGGTAAGAG GACAGCTACA GTATCACTAT TTCGCAACTT TTCTTTTCCT TTATCACTAG 6240 

AAGCTACTAA GTGGTCTGCT GATATTGCTT GTTCATCAAT TGCTAATTCC AGTCCGCCTA 6300

ACGGATCAAT TTCATCCGCA TGTATTTTCA CTTTAAAACC TGCTTCTTTG GCTTTTTGCA 6360 

TATAATGTTG CGATTGTTCT ATTGTAAATA CACCTGTTTC ACAGAAAATA TCCGCAAAGT 6420 

CTGCATATTG TTTTACTTCC GGAAGTAACG CAATCATTTC TTCTAAAAAT GCCTCATTTG 6480 

AACTTGCCTC TTTAGGTACA GCATGAGGCC CTAGGAAAGT ATGTTTCATG TCTAAATCAT 6540 

ATTTCTCAGC TAAACGATTA GACACTTTCA ATTGCTTCAG TTCATTTTCT CTATCTAATC 6600 

CATAACCACT C7TACTTTCA ACTGCAAGCA CGCCGTGTTT AATCATAGTA AGCAAATCAT 6660 

GCTCTGCTTT TTTAAACAAG TCATCTTCGG ATGTTTCTCT AGTAGCATTA ACGGTAGATA 6720 

ATATGCCACC ACCCATTTCT AATATTTCAA GGTAAGACTT ACCTTGACGT TTTAATGACA 6780

TCTCATGTTC TCGAGATCCA CCAAATGTTA AATGGGTATG TGCATCTACT AATGCTGGGG 6840

ACACTACCTT CCCACTAGCA TCAATCGTCT CAGTCGCATC GTAGTCATCT GTATGTGTTC 6900 

CAGCATATAC AATTTTGCCA TCT7TAATGA CAACTGTACC ATTTTTCACA ACATTTAATT 6960

CATCTAATTC CTTACCCTTC AAAGGTTTAT CTGTTGATCT CGGTAAAATT AATTCTGCTA 7020 

TATGATTAAT TATTAAATCA TTCATTACTT ATCACCTGCT TTATCAATCA TTGGAATATG 7080 
AACACCCATA CCTGGGTCAG TCGTCAATAC ACGTTCCAAT CTTCTTTCAG CACGCTCTGA 7200
TCCATCTGCT ACAACAACCA TACCCGCATG AAGTGAATAT CCCATGCCAA CACCGCCACC 7260
GTGATGGAAT GAAATCCATG AACCACCTGC AGCTGTGTTA ATGAGTGCAT TCAATACAGC 7320
CCAATCACCA ACCGCGTCAC TACCATCTTT CATACTTTCT GTTTCACGGT TAGGACTAGC 7380
AACTGAACCA GCATCTAAAT GGTCTCGTCC AATAACAATT GGTGCTGAAA TTTCACCGTC 7440
ACGTACAAGA CGATTTAAAG CTAAGCCCAT TTTCGCTCTT TCTCCATAGC CTAACCAAGC 7500
AATACGTGAT GGTAGTCCTT GATATGAAAT TTTTTCTTCA GCTAAATCAA GCCATCTTAA 7560
TAACTTTTCA TTTTCTGGGA AAAGTTTGCG CATTTCTTCA TCCGCACGCT CGATATCTTT 7620
TGGATCACCA CTCAACGCAG CAAAGCGGAA TGGCCCTTTA CCTTCACAGA ATAATGGTCT 7680
AATGTAAGCT GGTACAAAGC CTGGGAAGTC AAAAGCATTT TTCACTCCGT TATTGAAGGC 7740
TACTTGACGA ATATTGTTAC CATAATCAAA TGCTACAGCG CCACGTTTTT GGAATTCAAG 7800
CATTAATTCA ACATGCTTTG CCATTGAAGC TTGTGACAGT TCAACATATT TTTTCGGATC 7860
TTTTTCACGC AATACTTTCG CTTCTTCTAC AGAGTATCCT TGTGGCACAT ATCCATTTAG 7920
CGGATCATGT GCACTTGTTT GGTCAGTAAT AATGTCAATT TTAAATCCTT TTTCTAGAAT 7980
CGCTTGATGG ATGTCTACAG CATTTCCAAC TAACCCGATT GATAATCCTT CTCCACGTTC 8040
TTTCGCCTCT TCTGCTAATT TTAATGCTTC ATCTAAATCA GCTGTTTTAA CATCACAGTA 8100
TTTCGTATCA ATTCGCTTAT CAACACGTGT TTCATCAACA TCCACGCAAA TTGCTACCCC 8160
ATGATTCATA GTAATTGCTA ACGGTTGCGC ACCACCCATA CCACCTAAAC CTGCTGTCAG 8220
TGTAACAGTG CCTGCTAAAT CTCCATTAAA GTGTTGATTA CCTAGCTCGG CAAATGTCTC 8280
ATAAGTACCT TGCACAATAC CTTGAGAACC AATATATATC CAACTACCGG CTGTCATCTG 8340
TCCATACATG ATTAAACCTT TTTTATCTAA TTCATTAAAA TGATCCCAGT TTGCCCATTC 8400
AGGCACTAAT ACTGAATTTG AAATTAATAC ACGTGGCGCT TCTTCATGTG TTTTAAATAC 8460
AGCAACTGGC TTTCCTGATT GTACTAACAT TGTCTCATCT GATTCTAATT CTCGTAACGT 8520
TTTCTCTATT GCTTCAAAAG CTTCCCAATT ACGTGCTGCT TTTCCAATAC CACCATAAAC 8580
AACTAAATCT TCTGGTCTTT CAGCAACTTC TGGGTCTAAA TTGTTGTATA ACATTCTAAG 8640
TACTGCTTCT TGT7CCCAAC CTTTACACTC AATACTCAAA CCTTTTTTTG CTTGAATTTT 8700
TCTCATAAAA TTCGCTCCTG TTCTTTTAAG AAGTTAATTC CACTAAATTT AAAACGCTTA 8760
CATTATTATC TTCAATATTC ATTATAGTAT GTTAAAATAT AGCCAACAAA TATAAATAAA 8820
CTAATTATCC ATAGCTTGAA TCTATAAATA AAAGGAGCAA AACACATGAA AATTATTCAG 8880
CATATTAGCC AGCCATCTTT AACTGCTACG ATTAAAAAAA TGGAAGCAGA TTTAGGTTAT 9000
GACTTATTTA CACGTTCAAC AAAAGACATC AAGATTACCG AAAAAGGAAT ACAGTTTTAT 9060
CGTTATGCGA GCGAATTAGT TCAACAATAT CGATCCACGA TGGAAAAAAT GTATGATTTA 9120
AGCGTTACAT CAGAACCAAG GATAAAAATT GGGACTCTTG AATCTACGAA TCAATGGATT 9180
GCGAATTTAA TTCGAAAGCA CCATTCCGAC TACCCTGAAC AGCAATATCG TTTATATGAA 9240
ATACATGATA AACATCAATC TATAGAGCAA TTACTGAATT TTAATATTCA TTTAGCTATA 9300
ACAAATGAAA AAATAACCCA CGAAGATATA AGATCCATTC CTTTATATGA GGAATCTTAC 9360
ATTTTATTAG CACCCAAGGA AACATTTAAA AATCAAAATT GGGTAGATGT TGAAAATTTG 9420
CCACTCATAT TACCAAACAA AAATTCTCAA GTGCGCAAAC ACTTAGATGA CTATTTTAAT 9480
AGAAGAAATA TTCGTCCAAA TGTCGTTGTA GAAACAGATC GATTCGAATC AGCAGTTGGA 9540
TTTGTTCATC TCGGCTTAGG TTACGCTATC ATTCCGAGAT TTTATTACCA ATCATTTCAC 9600
ACGTCTAATT TAGAATATAA AAAAATTCGT CCAAACTTAG GCCGAAAAAT TTATATCAAT 9660
TACCATAAAA AACGCAAACA CTCCGAACAA GTACATACAT TCGTACAACA ATGCCAAGAT 9720
TATTTATATG GACTTTTAGA GGCTCTTTAA CTTAAGTTAT TAGAGCCTCT TATGCAGTTG 9780
CTCAGTCAAC TGTATACCTT TTGCCTTTAA CTTAAGTTAT TAGAGCCTCT TATGCAGTTG 9840
CTCAGTCAAC TGTATACCTT TTGCCTTTAA CTTAAGTTAT TAGAGCCTCT TATGCAGTTG 9900
CTCAGTCAAC TGTATACCTT TTTCCTTTAA CTTAAGTTAT TAGAGCCTCT TATGCAGTTG 9960
CTCAGTCAAC TGTATACCTT TTGCCTTTAA CTTAAGTTAT TAGTGCCTCT TATGTAGTTG 10020
CGTAGTCAaC TGTaTACCTT TTGCCTTTAA CTTAAGTTAT TAGAGCCTCT TATGCAGTTG 10080
CGCAGATCAT CGTATAAAAA TTAATGACGT CATTTCAAAA ATCGATACAA AAATAATTTA 10140
TTATAAAAAT TCTAAGAAAG AAGTGAAGCA GATGTTAAAA TCTATTAATC ATATATGCTT 10200
TTCAGTCAGA AATTTAAACG ATTCAATACA TTTTTATAGA GATATTTTAC TTGGGAAATT 10260
GCTATTGACT GGTAAAAAAA CTGCTTATTT TGAGCTTGCA GGCCTATGGA TTGCTTTAAA 10320
TGAAGAAAAA GATATACCAC GTAATGAAAT TCACTTTTCA TATACACATA TAGCTTTCAC 10380
TATAGATGAC AGCGAATTTA AATATTGGCA TCAGAGGTTA AAAGATAATA ACGTGAATAT 10440
TTTAGAAGGA AGAGTTAGAG ATATTAGAGA TAGACAATCA ATTTACTTTA CCGACCCTGA 10500
TGGTCATAAG CTAGAATTAC ATACTGGCAC ACTTGAGAAC AGATTAAATT ATTATAAAGA 10560
GGCTAAACCA CATATGACAT TTTACAAATA AGGTGTCATT ATAAAAAGGC CTCTTGAACT 10620
CCGTTAAAAT TTTAATTAAT TATTATATAA TAAGAGAACT TTTCAAACAA TACAGTTGTT 10680
TTACTGCAAT TATTTTTCAA ATATATCAAC GTTAATATAA CTTCTATTAA GAAATACTCA 10800
CATTCTGCCC TGCAATGCAA ATCTCGTCAC ATATAAATAT TTTTAATTAT TTTAAAAAAT 10860
GATGCACTAA ATTAGCAACG AGCTTAGCAG TTCTATTGTC AGCGTCATAT GTTGGATTCA 10920
TCTCAGCAAT ACTAACTGAA GACACCTTAT CACTTGGAAT AATACGTTTT GCTAATTCAA 10980
GAACAGTATG TGGATACAAA CCTAACACTG CCGGCGCACT TACCCCAGGC GCAAACGCAC 11040
TATCAATGAC ATCCATACAA ATCGTAAACA TAATGACATC ATGTTCATGT ACAAAACGTT 11100
CAATCATATC TTTAATTGTT GGTGATACGT GACTCAATAA TTCATCTGCA AAGACATAAT 11160
CAATCTTTTT CTCTTTAGCA TAATCAAATA AACTTTGCGT ATTACCACCT TGAGCAATAC 11220
CAAGCACTAA ATAATCTGTG TTTTCATCTT CTTCTAAAAT TTGTCTAAAG CTCGTTCCAG 11280
ATGTAGATTG TTGTTCAGCA CGTGTATCAA AATGCGCATC AATATTTATC ACACCAATAG 11340
ATTGTGTTGG ATAGACTTTA CGTGTTGCTA AATATTGAGC ATACGCAATA TCATGTCCAC 11400
CACCTAATAA AAATGTTTGT CTATGATTAG CAATTGACTT CGCTGCAAGC ATAGCAAATT 11460
CTTTTTGAGT ATCAATTAAT TCCTCATGAT CATGATAAAC ATTTCCGTAA TCGACTAAAG 11520
TTcACATTGA TTCAAATCCG GCAAACCTGC AAATGCTTGT TTAATCGCAT CTGGTCCTTC 11580
TTTTGCACCA ATGCGCCCCT TGTTTAAAGC AACACCTTTG TCAACAGCAT AGCCTAATAT 11640
ACCGACCCCT GATGGCATAC TACTCTTTTC CAGCTTAGAC AAATCTTCAA ATGTTACTGT 11700
TTGAAAATGT CTAAATTTTT TCGGGTCTGT TTCACTATCT AACCTTCCAG TCCATAAATT 11760
TGGTTCACCT TGCTTGTACA CAGCATTTCC CCCTCTTATT TATGTGGCTT ATTAACAATT 11820
AAAGTATAAC GTATAGGAAA TTTTGAATTC AATTCATAGt TAAATCCGTA TCTTAAAAAT 11880
ACTTATCTAC ATTACTTTTA CCCCTATTTT CTATGTAATA ACGAATACTT AGCTGATTTA 11940
TGTTAATAAA ATACGTCAAG ACTATTACAT TTTCATTAAT ATTGACATAG ACAATTTATC 12000
TCTCGGCTTG TAATATGTAT AATTGTTACT AAAAGATATT TTGCTTGTTA CCTAATGGAG 12060
GTTACATATA ATGAAGAACA ATAAAATTTC TGGTTTTCAA TGGGCAATGA CGATTTTCGT 12120
CTTCTTTGTC ATTACAATGG CGTTATCCAT TATGCTCAGA GATTTCCAGT CTATAATTGG 12180
TGTCAAACAC TTTATATTTG AAGTTACAGA TCTAGCACCA TTAATTGCTG CAATCATTTG 12240
TATACTCGTT TTCAAATATA AAAAGGTCCA ACTTGCAGGT TTAAAATTCT CAATCAGCCT 12300
GAAAGTAATT GAACGTCTAT TGCTAGCTTT AATTTTACCT TTAATTATTC TAATTATTGG 12360
TATGTACAGC TTTAATACAT TTGCAGATAG CTTTATTTTA TTACAATCAA CAGGCTTATC 12420
AGTACCTATT ACACACATTC TGATTGGACA 7ATTCTGATG GCGTTCGTAG TAGAATTCGG 12480
TGTTGTTGGT


7TGATGTATT 


CAGTTTTCTC 


AGCAAATACA ACTTATGGTA 


CAGAATTTGC 


TGCTTATAAC 


TTCCTTTATA 


CATTCTCATT 


CTCTATGATT 


CTTGGTGAAT 


TAATTAGAGC 


GACTAAAGGA 


CGTACAATTT 


ATATTGCAAC 


GACATTCCAT 


GCTTCAATGA 


CATTCGGACT 


TATTTTCTTG 


TTTAGCGAAG 


AAATCGGCGA 


TCTATTTTCA 


ATCAAAGTCA 


TCGCCATTTC 


AACAGCAATC 


GTTGCAGTAG 


GATACATTGG 


TTTAAGCTTA ATTATCCGAG 


GTATTGCATA 


TTTAACAACA 


AGACGAAACC 


TTGAAGAACT 


TGAGCCTAAT 


AATTATTTAG 


ACCATGTCAA 


TGACGATGAA 


GAAACTAATC 


ATACTGAGGC 


TGAAAAATCT 


TCTTCAAATA 


TTAAAGATGC 


TGAAAAAACA 


GGTGTAGCTA 


CTGCATCAAC 


GGTTGGTGTT 


GCTAAAAATG 


ATACTGAAAA 


TACAGTGGCT 


GACGAACCAA 


GCATTCATGA 


AGGTACTGAA 


AAAACAGAAC 


CTCAACATCA 


CATAGGTAAT 


CAAACTGAAT 


CTAATCATGA 


TGAAGATCAt 


GACATCACTT 


CGGAGTCAGT 


AGAATCAGCm 


GaATCAGTTA 


AACAAGCACC 


ACmAAGTGAC gA'iTraACAA 


ACGATTCAAA 


TGAAGATGAA 


ATAGAGCAAT 


CATTAnAAGA 


ACCTGCGACT 


TATAAAGAAG 


ACAGACGTnC 


ATCAGTTGTA 


ATTGATGCAG 


AAAAACATAT 


CGAAAAAGCT 


GAAGAnCAAT 


CTTCAGATAA 



A 

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8549 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
ATGTGTTGTA AACTTTTATG TTGAAAAAGC TACTTATCTC AATGAAAACA AGTAGCATTT 


AATAAATTAA TTAGTATACA GCTAGTTTTT CTAAITGTTC TTTAACTTGA ATTAAGTTTG 
ACCGTATTAG AGAGGCAGAT TGATCCATCG TTTGAATTGC TTGTCCTTCA TTTTCGTTCA 
AGCCATTACA AACAACTTCA AACTGTTGTG CCATTTGATC AAGACGCGCA TGAGCTTGTG 
TGTTTAAAAT AAACATATCG TCATAATGTG ATGGCGAATA GATAATTCGT CGTTGTATAC 
AAACGTATAA AAACCTTGTC ATATCAACGG TTTTGGCATT TTTAAACCTC TGTGTTTTCC 
ACGCATGTTT GCCCTTATTT AAATAATTTG CCCTTTTTTC GCCCCGAAAA AAAAACACAA 
AAAAATAACC ACACTCCTAA ATTAATAGGT GGTGTGGTTT TGTTGATTGT AGGGGTATAA 
AAATAACCGC ATTATTAAAG ATACGGTTAC TCTGTTATCT GTAAATATAA TAGTAGTTTA 
AAACAGGACT CCACATAAAA ATCAACTCCT TTATATACCA TAATGATACT ATATTTTCTA 660 

GTTTATTTCA ATTTTTCAGT TTTTAAAAAT GAGTTTCTGT TTTTATTTAT ACGCTTTTCT 720 

GTTTTCTTTT TAAATTTTAT CTTTTTGTTA TTCCATTCAT TGTAAAATTC TATTAAATTA 780

ACATAAAATT TTTCATGCCC TATTTTATTT GTTGATGAGA TATCAATGTA AAGACTCAAT 840 

ATTGTTTTTA AATAGATTTG ATGCAACGAC TGATAAACCG TATTACTATC TGCTATGTTA 900 

TTGGTAAAAT GCATAGAAAA ATATTCTAAT TTATTCATGC AATATATATG GGTTTCATTA 960 

TACTTCTTAA TGAGTGTATT TATACCTTGC AATACGTCAT TACTTTTAAT AACAATTTCT 1020 

TTTTCACCTG TCGAAAAAGT CCACTGTTTA TCTCCTATAT TTTCTTTAAT TGTTTTCTTG 1080

TTGTCAAATT CTAAAATTAT AGCCCGTAAA CACTCTTCTT TATAATTCTC GTTCTTGAAA 1140 

GTACGAAGCA AAATTTTTAT AAATTCGGTA TTGGTGACTT TTTTATAAGT GTGATATTTT 1200 

GCAATCTCTT TATCAGTAAA GACTGTTCTT AGTTCGTGAT TATCAAAACT TAAATTCATC 1260 

TTATTCTCTA ATTCATTAAT TTTATCTTGC AAACCAACAT TTTCTAAAAT TTTCTTGTTT 1320 

ATCTCCCCTA TATCAAAACT CCTTTTCGAA ATTAATTTTG AAAACTCGTC TGCCATTTCA 1380

ACAGCCTTTT CTTTCCTTTT ATACCTTTTG TTAAATTTAT GAACCACCGT TGCAGCATAA 1440

TACGATATCC CACCAGATAA AATAGATGaT ATTATCGGTA TGTATATATC ACCTTTCATA 1500 

TTTCCACCTC TTTTAACACA ATTAAGTATT ATGATACACA ACTTGCGCAA AAAGATGTAG 1560 

ACAGAACATA ATGGCGAACA AAAACAACCA CCCAGTAACT AGTATGGGTG GCGTAgACTA 1620 

TAACAACTCT ATGTTATCAA GATATATGTA TCGAGTGATG GCAAGGAAGA AGTCTCCTGC 1680 

GGGACCAACA GTCAGATATA TGGCCTCTGC CGGGCTATAT AGTTCACTCC TACTATATAA 1740

AAGTAAGTAT AACATAAAAA GCACCCCGTA AACTGTTATA CGGGAATGCT AAAGTCATAT 1800 

ATACTACGGG GAGTAGTATG AAAACTATGC TCTCTATCGT AAGAAAAAAC ACCCAGTGAC 1860

ATGCTTGGGT GAACAAGGAT AGATGTAAAT AGTTGATGCA TGTGTAcACA TCATAACAAA 1920 

AAACTAGCCC GAAGcTAGCT ATAACATAAA AAAATAGGCA AGTACCGAAG TACCTGCCAG 1980 

TTACGCACAT TTAAATCTTG AGAGTAATGT TAAAAAGTGT ATAGGAATAT TAACATCCAT 2040 

CCAAATAGTT ATTTAATAAC TGTAAGATTC CCTATAATTA ATGTAGCaAA ATTTTTATTC 2100 

TAAGTAAATA CTAAATCGTG CTAAACTTAC CAAAACTACT TATTCTATTA CCTGCCTTGT 2160 

CTACCTCTCC TGTCGCTATA TAACGACGTT GTCCACTATT AGCAATATAA GTAATCCATC 2220

TATAGCCATT GATGCAATAT GCGCCGTCAT ATTTAATTGT TGCGTTATTA GGTAATACAC 2280 

CTGTAATTCT TGAATTAGTT GAATAGCCGT CCCTTACGTT ATTACCTTTA ACATTGGCAA 2340 
CTGGCACTGG TGGATTTTTT TGGTTTTTAG CTGATGTTTT AACATTACCA GCTACCAAAC 2460

CACCTATAGG CTTACCATGA ATCGCACCGG CTATTAATTT AGAATACAAG TCATAGTTTT 2520

TCTTAATCCA ATCCATATCA TTTTTATTAG TAATAAAACC TAATTCAGAT AAACGATAGT 2580 

TTATATTTAT TTCTGCTGAT ACATTAACGT TTAGTAAATC ATTACGAGGT GTTACACCTC 2640 

TTATTTGTCC TAAGTTATTT TTAATAACAT CTTGTATACT TTTATCAATA GTATCTGCAT 2700

TGAATTGACT TGAAATAATA ACATGCCCAC CACTTGCACT TTCTCCTGCT GCGTCTAAAT 2760 

GAATCTCTAG AACAATGTCA TACCCATGTG ATTTAACCCA ATATAAGCCA TAATCTTTAT 2820 

TATTTCCTAC ATTAACACCG TAAGCAGTAT CTTGATACAT ATCTTGTGAT TGACTTGAGC 2880

CACCATATAA TGCAACTTCG TGACCTGCAT GTCTTAAATA CTTAGCGATA TTTGGTGTTA 2940

TATATTTACG GATAAAATCA CGTTCATTTG TTCCGTTTCC GACTGCTCCA GGATCGTTAT 3000

AACCATGACC GGCTACAAGC ATAATTTTTT TAGGTTTAAT TACTGCTTGC TTTTTGGCAG 3060 

TTGCTTGCTT AATAACGCTT TTAGCTTTAT CTCCAACACT TACTTTATCT GGGAAATTTA 3120 

ATCTAATAAA ATACATTGGG TCATCGTAAT AATGAACATG TCTTGTAACG GTTTCGGGAC 3180 

CCCAACCAGG TTGCGCAACG CCATTTGTCC AACCTTTACC ATTCCAATTT TGGCCAAACG 3240 

ATGTGAAAGT GTTTAGATTA GCGCTCTCAA CAATTTCAAC ATGTCCaGct CCGCCACCAT 3300 

ACTTTGACGG GAAAACGACA ATGTCCAACT TTTGCGGTAA AAAGCTATCA TAGTTTTTAA 3360 

TTATTTGCCC GTATTTTTCA ATCCTTGCTT TATTATCAAA TGGAATATTA TAAGCGTATA 3420 

AACCTTGTAA CcTTTCGCCT GTTGCTATCA TAAAAAACAT ATTTGCGTAA TCGTAACACT 3480 

GAAATCCATA AAACAAATCA GGATTGAACT GCTTCCCTAA TGAATTATCA AACCATTTTT 3540

CTGCTTGGTT TTTTGTTATC AACATTGGTC AACACCTACC CTAAATCATT TGTGTCGTTC 3600

ATATTCGTAG GTGTCATTAC TTCTTTAATT GGCGCTTGCC CTGTTGCTTT TCTATACTTG 3660 

TTTTCAGCTT TATATT7CTT TAGCTTTTGA TTTGCCCATT TACCTTCTTG AGATGTTGGA 3720 

TTATCTTTAT ATGTAGTATA TAAAGCAACA ACTGTTAAGA TAATCGATGA AACACTTTCT 3780 

TCATCTACTG GTATCGGACT TATACCTTTA TTCGCTAAAA ACTGATTGAC TAATGCTAAG 3840

ATCAATACGA TGTATCTTGT TATTACTTTT GCATCCATTT GTTTGCTCCT TTTATCCAAA 3900

ATAAAAAGCC AGTGCCGAAG CACTGACTCT TAACTATTAC TTACACTTAC TAAACCAGAA 3960

ACACGACCAA AAGCTATATC CTAAAATTCC CTTAAGCATG GTAATCACCT CCTTTAAATG 4020

CCAAAAATAG TTTTTAACAA GGCTATAACA AATGTACTTA GAATCGTCCC TATTAATCCT 4080 

AGAATCCACA TCTTGATGTC TCTAATATTT TTAGCATTTT TCTCTTTATT TTTTTCATCT 4140 
TGCGTTCTCA GACTGTCTTC TATTCTGTCG AATTTTTCAA ACATAGTCTT ATCATTTTCT 4260

TCTAATCGCG TTAAACGCCA ATCTTGTTCG TGTCGTTTGG TAAATCCAAA CATTACACCA 4320

CCCACTTTAT TCAAATTAAA AAGCCATAAG ATTATAACCT ATGACTCTAG ATTTTCTGGA 4380

TACTTTTCTC CTGTAATAAT TGCATATTCC TCTTTATCTA TAACTTCCAT ATCTACATAC 4440

CACGCTATAT CTTCTTTACT ATATTCTTTC AATTGATACC ATGTTTTAAT ATCTTCGAAT 4500

GTTGGTGAAA TTAATTTAAG CATTTTCAGT CTCTCCTTTA ACCTCTTCTA ATTTTTTATT 4560

AAGTGTCACA AGTTGTTTTG CCATTAGTGC ATTTTGCTTA TTAACTTGCA TCGATAACTT 4620

TGTACTTTGA ACAACTTGTT TCTGCATACT AGCAACCATT TTTCGTAAGA TGTCATCAGA 4680

AGCGACTGTG TTTTGTTCTT CACTGTCAAT CTGTTGATGC AAGTCATCTT TTTCTTCTGA 4740

ATAATCTTCG TTAAAAACTA TTTCCCCATT TGAATATTTA AAGGCTTTAG GTCTAAAAAC 4800

TTGAGAGAAA TTTTCTGGTA AATTTTCAAT ATCAATACCT TCTTCAAAGC CACCAATGAT 4860

AGCGTATGAA ATTATCTCAT TACGCTTGTT AACTAATATT TGCATTATTT TCTCACTCCT 4920

ATAATTTTGT TAATTGTCCC TCTATTTGCG TTCGCACCAG AGCCTCTTTG ACTTCCTAAG 4980

TCGAAATAGA CATCGTTTGA TATAGTTAAA GATGTACGAC TAGATTTAGT TAATCCAAAC 5040

TCATAAACAC CTCCACCATT TCCATCACCA TCTGGAAGAT TTGAGGGATT CAATGAAATC 5100

TTTCCTCCTC CAAAAGGACT GCCAAACTCT GTAAAGTCAC CACCTGGAAA AGTCCCATAA 5160

AAAATTAATA AAATAAATTG GTCTAAACTC TCATTTAAGT ACAATGTAGA GCCCACACCA 5220

TTTGCTGTTC CATCAAAAAT AACCGAATAC CTTTTATTAA ACTTGTCATC TGCGTATAAT 5280

TTAGCGTTAC TTTCGGCCAT ATTAGCTTTT GATTGGGCAC TTTGAACAGT TTCAAAAGGT 5340

GTATTGTAAT CATTAATAGC TAATTCTGAC CACTCAGACC ATGAACCCGC TTCTTTTCTT 5400

TTAACAAATA CTTTATTTGT ACCGTTCGGT CGATAAGTCA TACGCTTGTA ATCTGAAGTT 5460

ACTACTAAAT ATTCGACAGT ACCGTTAGTA CTAACACCTC TTGGATAATT TATAGCTTGC 5520

GAAACATAAA TAAATTGGGT TGAATCACCT ATTCTTTGTT CTGGATTATT AAAATCAAAT 5580

CCAGTAATCT GCATTATCTT ACCATCATCT TTAGTAATCT TAGCTTTTTG CCAATTTGAA 5640

GTAGAACCAC TTGTGACTAA ACCACCACTA TTCACTGACT GCTTGAAGGC TTCATGTTTC 5700

TCATCCATAT ATCGCTTTTG CTCATCGAAT GTTCTTGAAT ATGCTTGCGC TTTATTTTCC 5760

AAATCAGATA TATGGCTATT AGCAAGTTGC TTTAATTCAT CTATACTTGA AGATTTTGCT 5820

ATTTGAATAT CTGATAGACC TTTTTCTTTA GCTTTTTCAA TCAGACTCGC ATAATCTTCA 5880

CCATTTTTTA TAGCCTCGTC CATTGCTTTC GCACGATCCA TAATAGTTTT TTCTAATTCC 5940
TCAACGTTAA ATGTGATAGT TCTCTCGACA ACTACCACGT CTGAATTACC TAATTCTGCA 6060

ACCGAAACTT GAGCTTGATA ACTTCCATCT CGTTTAATTA CATCATTAGG TAATTGAAAT 6120

TTTAAAATAC CTTTAAATGG ATCTAATATT TCTAGTGGAG CAACTACCAT GACTCCTTTA 6180

CCTCGAATCG CTATTCGTGC kTTGATATTT tCTTCACTCA ATAATAACGG TTGATTATTT 6240 

TTAGTGATAT TAAAAAGAAG AACAGAAGAA TCACTCTCTC CTGTTCTAAA AGTTATATCT 6300 

AGATTTGAAA TATTTCCATA ATGCGCTGTG TTTTCTAAAT TTATAGCTAC AGATTTCTCT 6360 

AAATTACTCA TTAACTTATA ATTCTCCCTT CGTGTAAAGT CCATGGCCCT GAACTTGTTT 6420 

TACTATCATA ATTTTTCAAT AGTATCTCAG CAGATGCTGT AACACTATTA CGAACTAGCC 6480

TATGAACAAA GCCACCTGTG TTTGAAGCTT CTACATATAA GTTCCAACCA GCTACCCCTT 6540 

TACGTTCAGT TGGAAAATCT GTAAAACGTT TTGTATCATC CGTAGTTAAA TAAAACGACA 6600 

TGCCTACTAT GTTAATATCT GACATTTTTG TGATGAATGA AGGTACTCTC TCCCATTTAC 6660

CACTATTTTT AGGCACATAA TTCCAGTCCG AAATGTCTCC AGTTCTTCCA GAAAGCACCC 6720 

TTTCAAAAGT CATCATATTC CTTGCATAAC TATTACGCGT CAATATCTGA ATTACATCAC 6780 

CGCCAGTTTG TGGTGGCTTA ACTTCCAAGA ACCAACCTGC ATCACGCCAT TCTCTTGGTA 6840 

ATGGGAAATC ATCGATTTGA ACTGTATGAT CAGTGTATAA ATAGTAAAGA CCTGGCTCTG 6900 

TTAACATCCC AAGATTCTTA AGTTTATCAG GCCTCATTGG TAAAGGTTTA ACTCTACCAC 6960 

CTGTGTCACT CaTGATAAAA GGAACGCCTC TTGAGTGAAG TATTTCTAAA ATACCTCTTT 7020 

GCCCAATCAT GAAAATACGA TGTGTTCTAT TTCCaTCACC ACCGACAGTA ACACCTAGCA 7080 

TCAAAGCTTT TTTACCACTA TCTTTGTCAT AGTATATTTG CAAACCTTtC TgCTTCCGCA 7140

AATTCGCCAG GAAATGAATC tAgTGTTCCA CCATAGTCAG CATTAACCTG ATACGCTTCT 7200 

TCTCCTGTTT CTAAATCGAA AGCCGTTAAA TAGTTTCTAT TATTTGGATT ACTGTCTCCT 7260 

GTATACCAAT ACAAGTATTT TTCATCAAAA GTCACACCCT GCATTGGTTG GGTTTCGTTT 7320 

GTTAGTCTCA TAGGGATACT GATTTTATGC AAAACTTTAT CAATATTTTT ATCAACATCG 7380 

TCTAAACTTC TTATCTCTAT ATAAnTCATT GAGTTTTCAA GTTCCCACTG ACTTCTAGGT 7440

CTCTCaATTC TGTATAGAAT TTTATTTTCT TTTTCATTTA TGACAGGGGT GATGTAGGGT 7500 

TTTTCTGGGT GTCCTGTAAA TACATCTTGC ATACCATACT TGCCATAGCT AATTTCCACA 7560 

TTAGGCGTAT ACTTGAAACG AACTAATGTA TTCTCATTAT TACCATTTAA GATAAAACTA 7620

TAAATCCATA ACTCATcATC AATATATCTA TAACCGTTAT GTGTACCATG ACCCCCACCT 7680

ACAATCAATG AGCTGTCTAT AAATTGACCA TTAGGTCTTA GACGACTTAG CATATAGCCA 7740
ATTACTGCAT TTGTAAgAGG TGCAAGTTCT GTCACAAATA AAAATTCTTG CTTATCAGGT
TCAAAACGAT ACTCGATATC AAGAATTTCT TGTTTGGTCT TATTTAATTC TCTTATAGTT 
TCCTCTTTAT TAATTTGAGT TTTGGTTTCC CAATCGTCTA AATGTTCTTT TAATGTGTCA 
AAGGTTTCGC CGTTTACATT AACTCGAGCT TGAACAATCT CATTAGCACT GTTATTACGT 
GGTGCCACAA CAAGTGCGTT AATTTGACTT TGTAAAGATT TGTTTACTGC TGCTTGCGAT 
CTACCATTAT AATAAATTTG CTCAGCGAAG TGTTGAATTG TTTTAGCTyT CTGATGCAAC 
TTAAACTCTG TTGTCAAGCC AAGCGCAAAT TGCTCTATTC TTTGTAAGTT TTGTATTTCC 
TTAGCTCTAT AATCTCGACC TGCTAAAGCT CCCAAATCCT TTATTAAATA CAAATTTTCC 
ATAATGCACC TTCCTTTCTA ATAAAATAGC ACTGTACCAA GTTTCCCACT ATCGTCAACT 
GTTATTTTCC ACAATTTACC GTTTGGGGAT TTCTGTACAA TGCTATTTTG AATAATTgcC 
TGCtTCGCCT ATTTTTAAAT TATCTAATTT ATTTkTATCA TTTACCGAAA TGATACCGTC
TTGAGGCAAT CCATCAATAn CACTACTGCC TGCATAAGGT ATCCCATTTA TAGCTTTCCA 
ATGTGTAGCT GGAAAGTACT GTTTATCGT 
(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3601 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 
AGGCGTGTAG TGACTTACGG nTAGGAAACT ATGTATCCGA ATGATTTATT GAGACCAAAA 
AGGCATTAAA GTCCATTGAA ATATCnGGTA GCGmGTTGGT ACgTGGACGT GGGGGCCCTA 
GATGTATGAG TCAACCATTA TTCAGAGAGG ACATTTAACG TAATAAATTA TAGAmACGAG 
GGTGAAAATA ATGACAGAAA TTCAAAAACC GTATGATTTA AAAGGCAGAT CATTATTAAA 
AGAAAGTGAT TTTACCAAAG CAGAATTCGA AGGACTTATT GATTTTGCAA TTACATTAAA 
AGAGTATAAG AAAAACGGTA TTAAGCATCA CTACTTATCT GGAAAAAATA TTGCACTACT 
ATTCGAAAAG AATTCGACGA GAACGCGTGC TGCGTTTACA GTTGCGTCTA TTGATTTAGG 
TGCGCATCCA GAATTTTTAG GAAAAAATGA TATTCAATTA GGCAAAAAAG AATCTGTAGA 
GGATACTGCG AAAGTATTAG GTAGAATGTT CGATGGTATT GAATTCCGTG GTTTTTCACA 
ACAAGCTGTT GAAGATTTAG CGAAGTTCTC TGGTGTACCG GTGTGGAATG GATTAACAGA 
TCTAGAAGGA ATAAACTTAA CTTACGTTGG AGATGGACGT AATAATATTG CGCATTCATT 720
AATGGTAGCA GGTGCTATGT TAGGTGTTAA TGTAAGAATT TGTACACCTA AATCATTAAA 780
TCCAAAAGAG GCATATGTTG ATATTGcAAA rGAAAAaGCG AGTCAaTATG GTGGTyCAGT 840
CATGATTACG GATAATATTG CAGArcCAGT TGAAAaTwCm GATGCTATAT ATmCAGATGT 900
TTGGGTATCG ATGGGTGAAG AAAGTGAATT TGAACAcGTA TTAATTTATT AAAAGACTAT 960
CAAGTGAATC AACAGATGTT TGATTTAACA GGTAAAGATT CAACGATATT CTTACATTGT 1020
TTACCAGCAT TCCATGATAC AAATACACTT TATGGACAAG AAATTTATGA AAAATATGGA 1080
TTAGCTGAAA TGGAAGTTAC AGACCAAATC TTTAGAAGTG AACATTCAAA AGTGT7TGAT 1140
CAAGCTGAAA ATAGAATGCA TACAATTAAG GCAGTAATGG CAGCAACATT GGGGAGTTAA 1200
TCACTAAATG GAACGATATG AATATGATGT GTCTGATGAT ATAAGTGTCA TGTACAGACA 1260
CCTCATATTG GTATTAAAGG AGAAATGAAT ATGAACGAAT CAGGAGATAA CAAACTCAGT 1320
AAATCTTCTT TAATTGGACT AGTTATAGGA TCCATGATTG GTGGCGGTGC GTTCAATATA 1380
ATGTCTGATA TGGGCGGTAA AGCCGGTGGA TTAGCCATTA TTATTGGTTG GATTATTACA 1440
GCTATAGGAA TGATTTCATT AGCGTTCGTA TTTCAAAATT TAACCAATGA ACGGCCGGAG 1500
CTAGACGGTG GTATTTATAG TTATGmTCAA GCAGGATTTG GCGATTTTGT AGGATTTATC 1560
AGTGmTTGGG GATATTGGTT CTCAGCGTTT 7TAGGCAATG TTGCCTATGC AACACTATTG 1620
ATGTCAGCAG TAGGTAACTT TTTCCCGATT TTTAAAGGAG GCAACACATT ACCAAGTGTT 1680
ATTGTCGCCT CGTTACTACT CTGGGGTGTC CATTTCTTGA TTTTAAAAGG CGTTGAAACA 1740
GCAGCATTTA TCAATAGTAT TGTTACTGTT GCAAAGTTAA TACCGATTTT ACTTGTAATC 1800
ATATGCATGA TAATTGCATT CAATTTTGAC ACTTTTAAAA CAGGCTTTTT CAGTATGACG 1860
TCAGAGGGTG TATTGCCATT TAGTTGGGCG AGCACAATGA GCCaaGTtAA AAGTACGrTG 1920
CTAGTGACAG TTTGGGTGTT TATCGGTATC GAAGGTGCAG TAATTTTTTC TAGTAGAGCT 1980
nAAAATGAGA AAGATGTAGG TAGTGCCACG GTTATAGGAC TTATATCAGT TTTAATTATC 2040
TATyTCTTAT TAACTGTATT AGCTCAAGGC GTGATTTTGC AAAATCATAT TTCGCAATTA 2100
GATTCGCCAA GTATGGCACA GGTGCTTGCA ACTATTGTAG GTGGTTGGGG ATCTACACTT 2160
GTAAATATTG GTTTAATTAT TTCGGTACTA GGTGCATGGT TAGGATGGAC ACTGCTTGCT 2220
GGTGAATTAC CTTTCATTGT TGCAAAAGAT GGATTATTTC CAAAATGGTT TGCTAAAGAA 2280
AATAAAAATG GAGCACCTGT AAATGCACTG CTTATTACCA ATATATTAGT ACAATTATTT 2340
TTAATAAGTA TGCTATTTAC ACAGAGTGCG TATCAATTTG CATTTTCACT AGCATCAAGT 2400
CGACAGCAAG CAACTACTAA ACAATGGACG ATTGGTATCA TAGCCTCAAT TTATGCTATA 2520

TGGCTTATAT ATGCAGCAGG TATCAATTAC TTATTATTGA CGATGTTACT TTATATTCCA 2580 

GCTCTTCTTG TTTATACaAT CGkTCmAAAG rATwATCAGa CACGTTTGAT TAAATCAGrC 2640 

TATATTCtTT TTATGATTAT tATCGTACTT GCAGTTATCG GGTTAATTAA GTTATTGATG 2700 

GGAACGATAA ATGTTTTTTA AAAGGAGCGA CAAAAATATG AAAGAGAAAA TTGTCATTGC 2760 

ATTAGGCGGT AATGCGATAC AGACAACAGA AGCAACAGCT GAAGCACAAC AAACAGCTAT 2820 

TAGATGTGCG ATGCAAAACC TTAAACCTTT ATTTGATTCA CCAGCGCGTA TTGTCATTTC 2880 

ACATGGTAAT GGTCCACAAA TTGGAAGTTT ATTAATCCAA CAAGCTAAAT CGAACAGTGA 2940

CACAACGCCG GCAATGCCAT TGGATACTTG TGGTGCAATG TCACAGGGTA TGATAGGCTA 3000 

TTGGTTGGAA ACTGAAATCA ATCGCATTTT AACTGAAATG AATAGTGATA GAACTGTAGG 3060 

CACAATCGTT ACACGTGTGG AAGTAGATAA AGATGATCCA CGATTTGATa ACCCAACTAA 3120 

AcCAaTTGGT CCTTTTTATA CGAAAGAAGA AGTTGAAGAA TTACAAAAAG AACAGCCAGA 3180 

CTCAGTCTTT aAAGAAGATG CAGGACGTGG TTATAGAAAA GTAGTTGcGT CACCACTACC 3240

TCaATCTATA CTAGAACACC AGTTAATTCG AACTTTAGCA GACGGTAAAA ATATTGTCAT 3300 

TGCATGCGGT GGTGGCGGTA TTCCAGTTAT AAAAAAAGAA AATACCTATG AAGGTGTTGA 3360 

AGCGGTTATA GATAAAGATT TTGCTAGTGA GAAATTAGCA ACGCTGATTG AAGCAGATAC 3420

CTTAATGATT CTTACGAATG TAGAAAATGT ATTTATTAAC TTTAATGAAC CTAATCAACA 3480 

ACAAATCGAT GATATTGATG TAGCAACACT GAAAAAAtAC GCGGCACAAG GTAAGTTTGT 3540 

GGAAGGATCG tGTTGCCAAA AATAGAAGCT GCGtACgtTT GTTGAaAGtG GGGaAACCAA 3600 






(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



3601 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

CGACACTATT AAATGAATTA GAGCACAATC TAACAAATCA AATTCATTTT TCAAAAGATG 60 

AACGACTCAC ACATATCGCT TTAAAGTTAT TCGAAACAAC CGATCCTGTT TCAACAAAGC 120 

AACTTGCGCA AGATGTTAAT GTTTCGCGTC GGACAATTGC AGATGATATT AAAATGATTC 180 
TTATTGGTGA GGAAGATCAT TATCGTAAAG CGTATGCACA CTTTATACAT CAATATATGA 300

AACAAGCTGC ACCTTTTATA GAGGCGGATA TCTTTAATTC AGAATCAATC GCATTGGTTC 360 

GCCGTGCCAT TATTAAGACA TTAAATAGTG AAAATTATCA TTTAGTTCAG TCGGCTATCG 420 

ATGGCTTAAT CTATCATATA CTCATTGCCA TTCAGCGTTT AAATGAAAAT TTTTCGTTCG 480

ATATACCTAT CAATGAAATT GATAAATGGC GACATACTAA TCAGTATGCn ATTGCTTCAA 540 

AAATGATAGA AAACTTAGAA CGCAGTGTAA TGT 573 
(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1221 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

TTGATATTTA TAACGTTATA TTTTAATAGT TCACCTGGAT TATTAAATAA ATAGTCCGCC 60

AAATTTTCTT TTTCTTTATC AATCTGaTkG TAATTAACaC TTTCGaCTTC TGTAGGAATT 120 

CTAATGTCAA CAGAAGCATT GATATAAGCT TGATGTTGCA TGCAATCACA CTCCTAATCC 180 

TTCATmTmAA ACGGAGAAGT AAACCCGTCA CTATTCAAAT TCAATCCTTT TGCCCAATCA 240 

ACAGGCTTAT TCATGATAGT TTCGATTTCC TTAAGTCCAT TTGAACCTCT AGGTATTTCT 300

ACAATTACTT CATCATGGAC ATGGCCAACT ATTTTAAAAC CTAATGCTTC AAGCCTTGCT 360

ATAGAAATCG CAAGTAAATC CCTTGCAGTT GCTTGAACAA TATTCTCGAC TAACTTCCCA 420

CCATACGTTT TTAACTTTGA CCATTTACGG TTAAGATCTA ACCCCATAAA TTCAACAACT 480

TGACTACCCC AACTATTTTC ACCAACTAAA GCTTTTGGAT AAGCTAAAGC TCTTCCACTA 540

GGCAGTTCAA TCATTAGAAA ACCTTTTTTC ATATAAAATC TAAGTCCATG TGTATGATGC 600

GTCTTTCGGG ATTTTACAGT ATTAATTGCA GCCTCTTGGC AAGCCTTCCA AAAATTAACT 660 

ATGTTAGGAT TTGCGTTACG CCAACTATCA ACTAAACCTT GTAACTCGTT TTCTTCAATG 720 

CCCATTTCCA ATGCACCCAT TGCTTTTAAA GCTCCAGCGC CACCTTGATA GCCTAAAGCT 780 

AATTCGGACA CTTTTCCTTT TTGTCTGAGA GGGTCGCCTT TAGTTATGCT TTCTACCGGT 840 

ACATTAAACA TTTGAGAAGC CGATGCTTCA TATATCTTTC CGTGTGTGTT GAATACATCT 900

AAACGCCATT GTTCTTTTGC ATACCATGCT ATGACTCTTG CCTCTATTGC AGAAAAATCA 960 

CTTACTGCTA GTTCATTACC TTCTTCAGCA GTAAATGTCG TCCTAACTAA TTGACTTAAT 1020 
AGATCTCTTG CTATTTCTAA TTCAGTATCT GAAATATAAT GCTTTGTTAA ATTCTGAAGT 1140

TGTACACCTC TACCTGCCCA TCTTCCAGTA CCGGCACCGT AAAATTGAAA CAGACCTCTT 1200 

ACCCGTTCAT CACTGCACAT C 1221 
(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1090 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

TTTTGTTTGG TATGAGGTAG CAATGACGAC GTGTCATTGG TGGAGATTGT AAAAATACAT 60

AATAAAAAGA AGCGGCAATG TATACCGCTC CTTTTTTATA CTACATACCG ATTTTCAACC 120 

ATCTCTTTCT ACTTAGTAAT AAGACAATAG TATTAACTAT AAATAGAAGA ACGAAGAATG 180 

ATACTATATT TATAATTTCA GTAGGACACA TAAATGTTGA CTCGTTATTC AATATTTTTT 240 

CTACGGCACG ATACATCGTA TTGCTCGCCT CAAATGGAGC AACGATACCA AATATATTTT 300 

TATTAATGGC AACTAAGATG ACTGAACCAA TCCAATATAC AATGCTGATA CCTAAGCTGA 360

TTAAAATGTT AGGTGAAACC ATACTAATCG TTCCAACAAC TAAGATATAT TGTAAGATAA 420

CGAGTGAAAA TAAGATTATT AATAGTAAGT AATGTGAGAA ATCCGAATAT ATAATTGAAA 480

TAATAGTGAT ACTTAGAATT ATGAACACTA AACATTCAAA AAATAACACT GCTACCTTTT 540 

TATAGAAGAA GGTAAAGATA TTATCGCCAA TCAATTTATA AAACAGGATA TTTTTATTCG 600

AATACTCTTT ATTAATAAAA TATGCAATAA CAAATGAAAA TAGTAAGAAC CCTAATTGCG 660 

TTGCAACAGT ATATGAACTG AAGAAAAACT GGCTATAGCT TAAACTTTTA ACTTTGTCTA 720 

TACCTATTGG TAAAAAATAC CCAAGTAAGA AAAGGAATGT GAATAGCACA ACAAGCGTGT 780

AAATAATTTT ATTGGAAATA CTTTTTTTAA ATTCTAATTT CAAAGTGGAC ACCTCAATTA 840 

TAAATTAATG TAATCATTTA TGACTTCTTC TTTTGATTGG TACTCTTCTA TTTGAAGGTC 900 

TTTAAAAATA AAGTATTTAC CCGGCAAAGC ACTTAAATCG GATAAATTaT GTGTAATATT 960 

GATAATAGTT TTAGTTTGAT GGCTTTGAAT AAAATCATTT AAAAATTCAT AAATTTCATT 1020 

AACTGTTTTC TTGTCTAAAG CGTTTGTAAC TTCATCTAAT ATGATTAAAT CATGATCTTC 1080
CAATAAGAAA 

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 904 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

TTAGGACTAT TTTATCATAT TCATTTAAAT TACGGCTAAA AATTTTAAAA ACGGGGATTA 60

ATATATGGAA TTAAGCTATG AAAGTTAATT GATACTTGCA TTTTACGCTG ATTTATATAA 120 

GAATAACTAT TGTATAGTTT TAAAAACGAA CGTACGTTTG CAGGAGGCGA AATCATTGGC 180 

AATGAATAAA CAAAATAATT ATTCAGATGA TTCAATACAG GTTTTAGAGG GGTTAGAAGC 240 

AGTTCGTAAA AGACCTGGTA TGTATATTGG ATCAACTGAT AAACGGGGAT TACATCATCT 300 

AGTATATGAA ATTGTCGATA ACTCCGTCGA TGAAGTATTG AATGGTTACG GTAACGAAAT 360 

AGATGTAACA ATTAATAAAG ATGGTAGTAT TTCTATAGAA GATAATGGAC GTGGTATGCC 420

AACAGGTATA CATAAATCAG GTAAACCGAC AGTCGAAGTT ATCTTTACTG TTTTACATGC 480

AGGAGGTAAA TTTGGACAAG GCGGCTATAA AACTTCAGGT GGTCTTCACG GTGTTGGTGC 540

TTCAGTTGTA AATGCATTGA GTGAATGGCT TGAAGTTGAA ATCCATCGAG ATGGTAATAT 600 

ATATCATCAA AGTTTTAAAA ACGGTGGTTC GCCATCTTCT GGTTTAGTGA AAAAAGGTAA 660 

AACTAAGAAA ACAGGTACCA AAGTAACATT TAAACCTGAT GACACAATTT TTAAAGCATC 720 

TACATCATTT AATTTTGATG TTTTAAGTGA ACGACTACAA GAGTCTGCGT TCTTATTGAA 780 

AAATTTAAAA ATAACGCTTA ATGATTTACG CnwGGgTAAA GAGCGTCAAG AGCATTACCA 840

TTATGAAGAA GGGAtCaAAG rGTTgTTAGT atGTCCAaTG ArGGAAAAGA AGTTTTGCCT 900 

GACG 904 
(2) INFORMATION FOR SEQ ID NO: 11: 


(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11271 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

GATTTCTAAA TCAAGATCTG TTTTACGATA ACCATTCAAA CCTTGACGTT CATCTTCTTC 60 
AGGTTGATTT TGTTGCTGTG TGTCTTTGTT GTCAGAAGTC GCTACTGTTT TTTTATTATC 120 
TGTTTCTTTA GTCATAACAA ACGCCTCCGT TATAAAACGC TATATTTAAT GATATGTGAT 180 
TTAATAAGAC GATTCAGCAA GTTTTAAAGT ATTATTTGAC TATGTTGGAT TAGGCATCTA 300

GTCCTATAAT ATCACTGACA TTGTCAAAAT GATGATCTTT TAAGTAACGT GCGATGCCTT 360 

TGTTCATTTT CTTAGTTAAA CCTGGGCCTT CAATAACAAG TGATGAATAA ATTTGAATAA 420 

GTGACGCACC GTGACGCATC ATTTTGATTG CATCTTCAGT ACTGAATACG CCGCCTGTAC 480 

CTATAATTAA AAATTCACCA TTTGTTTGCT GATAAgCATa CTTAATCAAT TTTAAATTAC 540 

GTTCAAATAA TGGACGACCA CTCAAACCGC CTTCTTCGAC TTTATTAGCA GAAGTTAAAC 600 

CATCTCGTTG TCGCGTTGTG TTTGCTAAGA TGATACCGTC AAATGTCTCA GTAATCGCTG 660 

GTAATAGTGC TTTTAAGCCA TCGAAATCCA TATCAGACGT TAGTTTTAAA TAAATTGGCA 720

CTGTTACATC ATGTTGTTTT TTAAATGCTG TTAAAGCTTG GCATAACATT GAAAATTCAT 780 

CTTTATCATG GAAGTTTTGA AGATTTTCAG TATTTGGAGA ACTGATGTTG ACTGTGAAAA 840 

ATGAAACGTC GTGTTTAAAC GTATCAATAA CCTTTATATA ATCTTGATAA CGCGCTTCAT 900

AAGGTGTCAT TTTATTCACA CCAACATTGA TACCAACAGG TACTTGATAA GCATTTTTAC 960 

GCAAATGACT TAGTGCTTTG TTCATACCAA TATTATTGAA GCCCATTCGA TTTATCAAGG 1020 

CGTCATCTTC TAATAATCTA AACATGCGTG GTTGAGGGTT ACCCGGTTGA GGTTTAGGTG 1080 

TGATACCACC TAATTCTAAA GCACCGAATC CAAGGTGTTC CAATGCTTTT GGTACTTCGC 1140 

AAGATTTGTC GAAACCAGCT GCTAAgCCAA TTGGATTGTC GTACGTATTA CCTTGTATCG 1200 

TTTGTGATAA CGTTGGATTC TTATAAGTAA ATAGTTTATC GACGACTGGG AATAAAACCG 1260 

GaAACTTTTG TaACGTTTTT AATGCATCGA TAGTTAGTCC GTGTGCTTTT TCGGGTTCGA 1320 

TTTTGAATAA GAAAGGTTTA ATTAATTTGT ACATGAGTAT GCTCCTATTT CATTATATTT 1380

GAGGCTTACT ATCCTCAACT TAATATATGT GAAATATATT CTTTTAATAG ACTAGCATTT 1440 

CCATACATAA TTTCCTAGTT AAAACTAAAA AGTTTTGAAA ATTGACGCAA gTTTGAATAA 1500 


CGTTTTTAAG ATTAAATCAT CCTAATTAGG CAATATTATA GTATAAAGTA AGTAGATTGG 1560 

AAGGTGTTTG TATGAATGAA CAATGGTTAG AGCATTTACC TTTAAAAGAT ATTAAAGAGA 1620 

TTTCACCAGT GAGTGGTGGT GATGTAAACG AAGCATATCG AGTCGAAACA GATACGGATA 1680

CATTTTTCTT ACTTGTCCAA CGTGGACGTA AAGAATCATT TTATGCTGCA GAAATTGCAG 1740

GTTTAAATGA ATTTGAACGT GCAGGTATCA CGGCACCTAG AGTAATTGCA AGTGGCGAGG 1800

TTAACGGTGA TGCGTATTTA GTGATGACGT ATTTAGAAGA AGGGGCTTCA GGGAGTCAAC 1860

GCCAATTAGG GCAACTCGTA GCTCAATTAC ACAGTCAGCA ACAAGAAGAA GGCAAATTTG 1920 

GCTTCTCATT ACCTTATGAA GGTGGCGATA TTTCTTTTGA TAATCATTGG CAAGACGATT 1980 
GGCTATGGGA TGCCAACGAT ATCAAAGTAT ATGACAAAGT GCGACGTCAA ATTGTGGCGG 2100

AATTAGAAAA GCATCAAAGT AAACCGTCTT TATTACATGG TGACCTATGG GGTGGTAATT 2160 

ATATGTTCTT ACAAGATGGT CGTCCGGCGT TATTTGATCC AGCGCCATTA TATGGTGACA 2220 

GAGAATTCGA TATCGGTATT ACAACGGTAT TTGGTGGTTT TACGAGCGAA TTTTATGATG 2280 

CGTATAATAA ACATTATCCA CTCGCAAAAG GTGCATCCTA TAGACTTGAA TTTTATCGTT 2340

TATATTTATT GATGGTCCAT TTATTGAAAT TTGGTGAGAT GTACCGTGAT AGTGTTGCGC 2400 

ATTCTATGGA TAAGATTTTA CAAGATACAA CAAGTTAGTT AAGACGTTAG ATTGAGATAA 2460 

ATAGATAATA TGCACAGATA TTTTTACAAT GAGAAGCGAT ACAGCTGCCT CAATAAAAAT 2520

ATTTGTGCGT TTTTATTGTT GGAAAATAAA ATTTTAATCG CTATTGTTAA TTTCTGTAAT 2580 

GTAAAACAAG GTTGAGTTAC AATAAAAGTG ATTTTATAAC TTTTTGTTCA ATAAAATTCT 2640 

AGGAATGATA CATATTTATT GATACAATAA TTTTGAATAT AATCATAAAA CAATATTTAA 2700 

GTATAATTGA ATGTTTGAAT ATCATATATT GATACAGTTT CTAATAATTT TAAAATAATT 2760 

TAAATGGAGA GAGGTGTAAA TGATGAGTAC AGTTCAAAGT GATATTTTTA AGACCAATAG 2820 

TGCATCATCA TCTATTAAAA GCGCTGTTGA AACATGTAAT AATGTGTCGA AACCGGATAA 2880 

AGATGAAAGT ACAACAGTAA GTGGAAATAA TAATGCTCAT AGTGTGATAG ATGATTTGAT 2940 

GAGTAAGAAT CAATCTGTTG CTGAAGCAAT ACGAACTGCG AGCGATAATA TACAAAAAGT 3000 

TGGTGAGGCT TTTGACCAAA CTGACGTAAT GATTGGTAAT GAAATTGGTA AAAATTAAAA 3060 

CGTGGTGAAA TGATGTCGAA TAAACTGGAT GAAATCAATA AAATAATCAC AGCGAAACAT 3120 

GAGCAAATGG ATGACTTATA TGATGAAAAG CGAGAGGTTA AAGCATTGAT AGATGAAAGT 3180

GATGCGCTTA ATCATTCGAT AGATCAATTA TATCAACATT TAGGTGAGCG TTATTATAGT 3240 

AGCAATATGG CTAGTCGTAT GGAACAGTTC CGCGATGAAT TTCATTTTGC GAAACGACGT 3300 

TCAACGGAAG CGTTATACGA GCAGCAACAG CAAATTCAAC ATGGCATTCG TAAAGTGGAA 3360 

GAAGAGATGA TTGACTTGGA AATGCGAAGG AATGTTGAAA TTGAGACGGT GACAAAGGAG 3420

GAAAATAAAT GGAAACAATA GGAAGCATTA TTTATTTAAA AGAAGGTTCG CAAAAGTTAA 3480 

TGATTATTAA TAGAGGmCCA aTTGTAGAAA TTGAAAATCA AAAGTATATG TTTGACTATT 3540 

CTGCATGTAA ATATCCGATT GGTGTTGTAG AAGATGAAAT TTATTATTTT AACGAGGAAA 3600 

ATATAGATTC AGTTATTTTT AAAGGTTATT CTGATCAAGA TGAGGTTAGA TTTCAAGAGT 3660

TGTTTGAAAA TATGAAACAA AATTTGGATA GTGAAATACA ACGTGGAGAA GTTACACAAC 3720

AATAAAGAAA TACTTTTTCT TTATTGGGGT GGGACGACGA AATAAATTTT GTAAAAATAT 3780

ATGTCATTCA TAATCATTTG AACTAAACGT AGCAGCCTTA AATTTTAAAA AAAGACACAT 3900

ACCAACTTCC GAAATGTAGA TGAATTCTCT ACAATAACGG AAGTTTTTCT TTTAATATTG 3960 

AAATTTCTCA AGGATAGGTC TATACTTTAT AAATCGTAAT TATTACGATT TATAATCAAA 4020 

AACAATAACT TGAAATAGAT CATTGAGGGA GTGTTAATAT GCAACATCAT AAAGTGGCTA 4080 

TTATcGGTGC CGGTGCTGCA GGTATAGGTA TGGCCATTAC CTTAAAAGAT TTCGGTATAA 4140

CAGATGTCAT TATTTTAGAA AAAGGAACAG TAGGACATTC ATTTAAACAT TGGCCGAAAT 4200 

CGACCCGTAC GATCACGCCA TCATTTACGT CTAATGGATT TGGCATGCCT GATATGAATG 4260 

CAATTTCCAT GGATACTTCA CCAGCATTTA CATTTAATGA AGAACATATT TCCGGAGAAA 4320

CATATGCTGA ATATTTACAA GTGGTTGCCA ACCATTACGA GCTGAATATC TTTGAAAATA 4380 

CAGTTGTCAC AAATATATCT GTAGATGATG CATATTATAC GATTGCAACG ACAACAGAGA 4440 

TATATCACGC GGATTATATC TTTGTCGCAA CAGGTGATTA TAATTTCCCT AAAAAgCCAT 4500

TTAAATATGG TATTCATTAT AGTGAAATTG AAGACTTTGA TAACTTTAAT AAGGGGCaAT 4560 

ATGTGGTTAT CGGAGGTAAT GAAAGTGGCT TTGATGCTGC ATATCAACTT GCAAAAAATG 4620

GCTCTGACAT CGCACTTTAT ACTAGCACAA CCGGTTTAAA TGATCCGGAT GCTGATCCTA 4680 

GTGTTAGATT GTCACCTTAT ACACGTCAGC GACTAGGTAA TGTCATTAAG CAAGGTGCTC 4740

GCATCGAAAT GAATGTACAT TATACAGTTA AAGATATTGA TTTTAACAAT GGACAGTATC 4800 

ATATCAGTTT TGATAGCGGA CAAAGTGTGC TTACACCTCA TGAACCAATA CTAGCAACTG 4860 

GCTTTGATGC AACAAAAAAT CCAATCG7TC AACAATTATT TGTGACAACA AATCAAGATA 4920 

TTAAATTAAC AACACATGAT GAATCGACAC GTTATCCGAA TATTTTTATG ATTGGTGCAA 4980

CAGTTGAAAA TGATAATGCC AAATTATGCT ATATCTATAA ATTTAGAGCG CGATTTGCAG 5040

TAC&GCACA TCTTTTAACA CAGCGGGAAG GcTTACCAGC TAAACAAGAT GTCATTGAAA 5100

ATTATCAAAA AAATCAAATG TATTTAGATG ATTATTCATG TTGTGAAGTG TCATGCACAT 5160

GTTAGAAGTG AAATATGATA TGAGAACTGG GCATTATACG CCCATACCTA ATGAACCTCA 5220 

TTATTTGGTT ATTAGTCATG CGGATAAACT TACCGCAACA GAAAAAGCGA AATTAAGATT 5280 

ATTAATCATA AAACAGAAAT TAGATATTTC ATTGGCAGAA AGTGTAGTTT CTTcGCCTAT 5340 

AGCGAGTGAA CATGTGATAG AACAATTGAC ACTATTTCAA CATGAGCGAC GACATTTAAG 5400 

ACCTAAAATA AGTGCGACAT TTTTAGCCTG GTTGTTGATA TTTTTAATGT TTGCATTGCC 5460 

AATCGGTATC GCTTATCAAT TTTCAGATTG GTTTCAAAAT CAGTATGTGT CAGCATGGAT 5520 

AGAATATTTA ACTCAAACAA CATTGCTCAA TCACGATATA TTACAGCATA TATTATTTGG 5580 
ATTGATTAGT TTATCAACTG CTATAATTGA TCAAACAGGA CTCAAATCAT GGATGATATG 5700

GGCAATTGAA CCGTCAATGT TATGGATAGG ATTACAAGGT AATGATATCG TGCCACTATT 5760 

AGAAGGGTTT GGATGTAATG CAGCAGCTAT TTCACAAGCA GCACACCAAT GCCATACCTG 5820 

CACGAAGACA CAGTGTATGA GTTTAATAAG CTTTGGrAGT TCTTGTAGTT ATCAAATAGG 5880 

TGCGACATTA TCTATTTTTA GTGTAGCTGG AAAGTCATGG CTATTTATGC CGTACTTAAT 5940 

ATTAGTACTT TTAGGTGGCA TCTTACATAA AGGATATGGT TGAAAAAGAA TGATCAACAA 6000 

CTTAGCGTTC CGCTACCTTA TGATAGGCAA TTACATATGC CAAATATACG TCAAATGTTG 6060

CTACAAATGT GGCAAAATAT ACAAATGTTT ATCGTTCAAG CGCTACCTAT TTTTATCACA 6120

ATCTGTCTTA TTGTTAGTAT TTTATCACTA ACGCCAATTT TGAATGTTTT ATCACAAATA 6180

TTTACACCTA TATTATCGTT ATTAGGCATC TCGTCAGAAT TGTCACCAGG GATTTTATTT 6240

TCAATGATTC GAAAAGACGG CATGCTCTTG TTTAATTTGC ATCAGGGCGC CTTATTACAA 6300

GGAATGACAG CAACACAGTT ACTACTACTT GTGTTTTTTA GTTCAACATT TACAGCGTGC 6360

TCGGTCACAA TGACGATGCT TTTGAAACAT TTAGGTGGTC AGTCAGCACT AAAATTAATT 6420

GGAAAGCAAA TGGTGACATC ATTGTCTTTA GTTATTGGTG TAGGCATCAT TGTTAAAATA 6480

GTAATGCTGA TTATTTAAAA AAAATGAACT ATAACTGAAT ATAGAGTCAT GTCAGTCAAT 6540 

AGGAGATCTA TCTTGGAATA TGCTATTCAT ATGAAGTATA AGAGGAGAGT CGCAGATGAA 6600 

AATAGTTATT ATAGGTGGGT TTTTAGGTGG CGGTAAAACG ACTGTCTTAA ATCATTTGCT 6660 

CGCTGAATCA TTAAAGGAAT CGCTGAAACC AGCAGTCATC ATGAATGAAT TTGGGAAAAT 6720 

GAGTGTTGAT GGTGCCTTAG TATCTGAAGA CATACCTTTA AGTGAACTGA CAGAGGGGTG 6780

TATCTGTTGT GCAATGAAAG CAGATGTATC AGAACAGTTA CATCAATTAT ATTTAAAAGA 6840 

GCAACCAGAC ATTGTATTTA TTGAATGTAG TGGGATTGCA GAACCGGTCT CTGTCTTAGA 6900 

TGCTTGTTTA ACGCCTATTT TAGCTCCGTT TACAACAATT ACACATATGA TTGGTGTAAT 6960 

AGACGCAAGC ATGTATAAAC ACATTAAATC ATTCCCTAAA GACATCCAAG GCTTATTTTA 7020 

TGAGCAATTA GCATATTGTT CTGTCTTATT TGTTAATAAA ATAGATTCAG CAGATGTTGA 7080 

AACAACGAGC AAACTATTGA AAGATTTAGA AGTTATTAAC CCAGAGGCCG ATATACAAGT 7140 

CGGTATGCAT GGCAGCGTCA CTTTGCCAAT ATCAGTTAGA CAAATGACAG CAACTTCTGA 7200 

CAATAAACAT AAGTCTTTAC ATCAAATGAT TAATCATCAA TTTGTGCAAT CACCAGTCAA 7260

ATGTACTAAA GCAGAGTTTA TAAAACGTTT AGCATGCCTT CCGTCTCATA TTTATAGGTT 7320 

GAAAGGGTTT ATGACATTTG AAGACACCGC ACATACGTAT CTCATTCAAT TTACACAAGG 7380 
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CGGAAAGGGT ATTTCAAAAG AAGACTATCA ATGTTTGGAA CAGTAGTGTT TTCAGTGGAA 7500 

GAGAATGGTT AACATGCCTT CATGTATAAT AACGAGTTGA TTTGAACGTT TAAGCGTAAA 7560 

TAAAAATAAG CTTGGTCAGC CATCAAATAT AATTTGAAAA CTGTCCAAGC TGTTTTATTA 7620 

GAGAACAATC AATTAACCCC ACATATTTAA TAATACATCA GCAAAGCCTT CAGGTTTTTG 7680 

AATATAACCT AAGTGACCGC CTGGAATATC TACAATAGGT ATGCCAGTTT CTTTATTTAT 7740

ATAAAAGTTA ACATCTTGTG GGAAGGAGCC TCTAGAATCT GTCCCATTTA GTAGGGTGAT 7800

TTTATCGCTG TATTTTGTGA AATCATCCAA AGTAATATCT GAATGCGTAT ATTGTCTAAT 7860 

TTCAAATTCT GACCAGAACA TCGTACGTTT GTACTGTTCT ATACGTCCTT CTTCAGTATC 7920 

AGCAGGTTGA GACATCATTT TTGCATCAAT TGGTGCGATA TTTAATGTTT CGCCAAATGT 7980 

TTTCATGCCT TTTTCTAAGC CTTCTGTTAA AATTTGATGC ACAATGTCAT CATTTTTATC 8040 

TTTCCAATAA GTACTGTCTG GTAAAAATGT ATTAATTGGT GGTTCGTGAA ATGCAATCTT 8100

TTTAACGACT TCAGGGTAAT CTTTTAACAC ATGCATCGCA ACGATTGAAC CTGAACTTGA 8160 

ACCTAATATA TAGACAGGTT CATCACTTAA TGACTTTGCA AGTTCGGCAA TGTCCTGTGC 8220 

GTCGCGTTTG ACACGATAAT CACTGTCAGG GTTTGAAGCG GAATCAGGGA GTGGTTCAGT 8280 

TAACTCGCTT TCTCCATAAT CACGACGATC AACGGCTACA ACAGTAAAAT GGTCTTTTAA 8340 

CTGTTCTGCA AGAGGCAGAA AAATGTCTCC GGTACCGTTT GCACCAGGAA TAAAGATGAG 8400

CACGGGTCCT TGTCCGACTT GGTGGTATCG TAATTTAGCG CCTTGTAATT CTAAAGTTTC 8460

CATATTCAAT GACCTCCATT TGTTAATTGT TAGGTGATAA ACCTAATAAT TTAGCACCAT 8520 

TTGTATAACT TATTTTCTCT TTTTCTTCAT CTGTTAAACC CAGTTCATCT AAAAATACAC 8580

CTAATTTTTC AGGCTCAATA TATGGATAAT CAGCAGCATA AAGAATTCTA TCAATACCTA 8640 

CTTCTTTCTT GACTAAATCA AACTGTGGCT TCGTTAACAT GCCACTCGGT GTGATATAAA 8700 

AATTATTTTT AAAGTAATAG CTTACAGGGT GGTTCAAATG TTCAGCGAAT AAAGCTTCAT 8760

CCATACGTTC TAAGAAGAAT GGGATAAACT CACCCCAATG TCCAATAATC ATATTTAACT 8820 

TTGGATAACG ATCAAAAATA CCAGATAATA CTAGATGTAT TGTATGAATG CCGACATCAA 8880 

TGTGCCAACC ATAACCAAAA CAAGCAAATG TTGCCGCAGT TACTTCAGGA TAATTTCCTT 8940 

TATAGTATGA TTGATAAATG TCACTGTTAA CTGGCGCGGG ATGTAGATAA ATCGGTACGT 9000 

CTAAATTTTC AGCTGTTTTG AAAATAATGT CATATTTGTC TTGATCAAGA AAACCATCTT 9060

GTGCACGTCC CATAATGAGC GCACCTTTGA ATCCTAAATC ATTGATGCAA CGTTCGAATT 9120 

CTCGCGCTGC GGCTTCAGGC TCATTGATAG GTAAAGTTGC AAAGCCTACA AAGCGATTGG 9180 
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TCTGACCAAC CAAATTTGAA GGAGAACCAT TTCCATAAGA TAAGACTTGA ATTTGAACGT 9300 

CTTGATTATT CATAAATTGG ATACGTTCAT CATGATGTGA TAATTCGTCG GCATTTGTAA 9360 

AACCTGTCTT TTTTTcAAGG CCTTCTAACA TTACTTTCAT CGGTACACCT TTAGGATCTG 9420 

CTGATATCGC ATTCATCGTT TCTTTTTGAA TATCTTCAAT GACATAATGT TCTTCAAACG 9480 

TAATACTTTT CATTTACTTC GCCTCCATAT TGTATTGCAT GTTTATTGCA TCTATTGCAG 9540 

AAGCATTTTT TATATACCTC TAATTTCAAT GTTTGTAACA TAAAATTGAT CTACCAAGGC 9600 

ATCTCTCCAT CGCCATTAAT AAATGTACCT GTTGGGCCAT CTGCACCAAT CGTTGCTAAT 9660 

TGAATGATTG GCTTGATTCC TTCAGAAACG TGTTTGGAAT TATTACTAAA ATCACCAACT 9720 

AAATCAGTAT TTGTAGCGCC TGGATCAGCA GCATTGATTT GCATGTTAGG TAATCCTTTA 9780 

GCGTATTGTA GCGTTAGCAT TGTTACTGCC GATTTAGACG AACAATAAGC TAATGAATTC 9840 

ACTTTAGATT CAGCTGTTTC GGGGTTTGTA ACCATTCCAA ATGAACCTAA ACCACTTGAT 9900 

ACGTTGACGA CAACAGGTTG TTCAGATTTT TCTAAGAGAG GGACGAATGT ATTCATCATT 9960 

CGTACGATAC CGAATACATT CGTTTGATAT ACTTCTTCAA CGTCACGAGG TGTCAATTTG 10020 

GAAGGTGCTG AAAATTGACC AGATATACCT GCATTGTTAA TGAGGATATC AAGACGGCCT 10080 

TCTTTTTCAG CAATCATGTT ATAAGCATTT TTGACTGAGT AGTCACTTGT AACATCTAAT 10140 

TGTACATAAT GAACACCTAA TTTTTGTGAT GCTTGTTGTC CTCTTACATC ATTCCGAGAA 10200 

CCTATATAAA CTTTGTAACC CAATGCTTTA AGTGCCTCTG CACTTGCATA GCCTAACCCT 10260 

TTATTGCCTC CTGTGATTAA CACAATTTTA GTCATTACGT CCCACCTCAT CTAAATAAAT 10320 

GTTTAATAAA TAATTTCTGT ACGCTTCAAT TGAAATATGG CGATGCTCTA TTTGGAAGGC 10380 

AAATACACTA GTTGATAATG ATTGCAACAG CATATCTGTT TTGAAtTCGT GTAAGTGTCG 10440 

TCATCGCTTT TAAATAAGTC ATAATAAAAA TCAAATAATT CTTGATAAAA TGCGCTTTGG 10500 

TAAAAACGTA ATTTATTGTT GCCTGCTTCA ATACATTGCA GTAGTGCCTT ATTATCGATT 10560 

TTAAATTGTA AAAGATAATC TAACGACACT TGCATAACCT CATAATTAGA ATGATAGTCA 10620 

TCTTTAATTT GCTTAAAATG AGTGATAAAA ATATCAAGGT CTCTTTGTAT GACGTAGTAG 10680 

CATAAATCGC TTTTATCTTT GAAATGTCGA TACAATGTCC CCATACCGAT ACCTAGTTCT 10740 

TTAGCAATAC GATTCATACT AATGTTTTCA ACGCCTTCTT CATCAAAAAG TTTGTGCGCT 10800 

ATTTCTTCAA TTCGTTGCCT ATTCTCTTTT GCATCTTTTC GCATGATTAC ACCTACTTAA 10860

AATTCTCTAA AATTGACAAA CGGATAACTC TCCGTTTATT ATAAAACGTG TTAAGAAAGT 10920 

TAGCAATGAA TTTGCAATAA CTATTAAATA TCATAAAAGA AAAGAGTGTT GATAATGTCT 10980 
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ACCTTATCGG TTCAAATGAT TGCTGAAAAA CTGAATGTCA CTACAGAAGA TGTGGAAAAA 11100

GTATTAGCTA TGACAGCGCC ACTAGGCATT TTTAGTCATC AATTACAACG ATTTATTCAT 11160

TTAGTATGGG ATGTCAGAGA TGTAATAAAC GACAATATTA AAGGAAATGG ACAAACACCA 11220

GAACCATATA CGTATTTAAA AGGTGAAAAA GAGGACTATT GGTTTTTAAG A 11271

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6261 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 



CAACCCGTTC AGAACAAAAT 60

GAAAAACTTT TTTACATCAT 120

TGTTTCATAA AATGTAACTT 180

ATTTCAATTT CACCGTTTTC 240

TCTTCATCGC CGCCACGGTT 300

TCTTGCTCAA TCGAACCAGA 360

TGTTCAACAC CACGAGATAA 420

AATGCTTTTA ATGTACGAGA 480

GAACCACTAC CTTGAATCAA 540

TGCTTTAATC GACGACATTT 600

ATAAAAATCT TCGTACGTGA 660

TCAGTCATAG TACCCGTTCT 720

CGTGTGGCTA ACTGATCAGC 780

TCATGCGTTG CAACTTTTTG 840

CGCGCTGCAA GGATAATTAA 900

CGATATCCTG TAGGTATACC 960

TCATACACTT GTCCTAAGAC 1020

GATAGCTCTA AAATTCGACG 1080

TTATATCCAT CATTGGCAAT 1140
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TCTGCAAGAT ATTGCGGGCC ACCCGCTTcA TTCAACGTAC CTTCCGTCGA TAATTGATCC 1260 

ATCAATGTTA CAACATCAAT TTCTTTATTA TCTTCATTTA AGTGCATCAT TGCACGGAAA 1320 

ATATGTTGAT GGGCACCCCT ATAAAACGAC TCAGGAAGCA AAACTTCCTG AGTAGTATTA 1380 

ATCAATTCTG GATCTATAAT AATTGAACCT AAGACAGACT GTTCAGCTTC ATTGTTATGC 1440 

GGCATTTGAT TTTGCTCATA CATTCTATCC ATGAATGGTT ACACCTCTTA TTTCAATCCA 1500 

ACTTTATTGT TCAACTGTGT GTACGCGAAT TGTACCTTCA ACTTCTTTAT CTAATTTAAC 1560 

AGGTACATTC GTATATCCTA GGGAATGAAT TCCATTTGGT AAATCCATTT TACGTTTATC 1620 

AATTTTAATA TCATGTTGTG CTTTTAGTGC TTCGGCAATT TGTTTTGTAC TTACTGACCC 1680

AAACAATTTA CCACCTTCAC CAGTTTTTGC TGaTACTTCA ACTTCAATGT TTGATAACGT 1740 

TTCTTTTAAT GCTTTAgCAT CTTCAATTTC TTGTTGGCGT TCTTGTTTTG CACGTTTTTT 1800 

CTGTAACTCT AATTGTTTAA GGTTACCTGG TGTTGCTTCT ACAGCATAAT TCTTTTTCAA 1860

TAAGAAGTTA TTTGCATAAC CTACTGGTAC TTCTTTAACT TCACCTTTTT TACCTTTACC 1920 

TTTACCTTTA ACATCTTGTG TAAAAATTAC TTTCATGCAT CTTCACTCCT ACTTAATTGT 1980 

TCTGTAATTG CTTGTTGTAA TTGTGCTATC GCCTCTTCGA CTGTCACACC TTTAAGTTGT 2040

GTTGCCGCAT TGGTTAAATG TCCACCGCCA CCAAGTGCTT CCATTGTTAA CTGGACATTT 2100 

ACTGAACCGA GTGAACGCGC AGATATACCA ATCAGATTAT CTTCACGTCT CGCAACAACA 2160 

TATGATGCTT CAATACCTTC TAAACTTAAC AGTTCATCTG CTGCTTGTGC AACTGTTACT 2220 

GGATGATAAA TTTTATCGTC TGAACCATGC GcAATGGCTA TGCCATTATC TTCAACTTTT 2280

ACAGTTCGAA TTAATTCAGA TCGATTAATG TAAGTATCCA CATCATCTTT TAAGAAATGT 2340

TGCGTTAAAA TCGTATCTGC ACCATGTGCA CGTAAATAAC TCGCTGCATC GAATGTTCTT 2400

GATCCTGTTC GTAATGTAAA GTTTCTTGTA TCTACAATAA TACCTGCATA CATCACTGTT 2460

GATTCAAGAC GTGTTAAACG TTGTTCTGTT GGTTGATATT CCAGTAACTC TGTTACCAAT 2520

TCAGCTGTCG AACTTGCGTA TGGTTCCATA TATATCAACA ATGGATTAGA GATGAAGCTT 2580

TCACCACGTC TATGATGATC GATAACAACT TTACGGTTTG CTTTATTTAA GACATTTTCA 2640 

TCTAAAACCA GTTCCGGTTT ATGCGTATCA ACAATCACTA CGGTTGTCTT AGATGTCATC 2700

ATATCCCAAG CATCATCTGA TGTAATAAAT CGCTCTCTTA ACTCTGGCTT TTTATCTATT 2760

TCGTTCATCA CGCGTCGTAA TGTTGGATCA ATGTCAGTCT CATTTAATAC GATGTATGCT 2820

TCTAAATTAT TCATCATTGC AAATCTAGAC ACACCGATTG CTGCACCAAT TGCATCTAAG 2880

TCAGGACGTT TATGTCCCAT GATAATGACT TTGTCACCCT CTGCAAGGAT ATCTTTTAAC 2940 

CCATAGAAAC GCACATTACC ATTAATACTT TTAATTGCAA CTTGGTCGCC ACCGCGTCCT 3060 

AATGCTAAGT CTAGGCCTGA TTGTGATAAT TCACCTAAGT CGATTAAATT TTCAGTACCT 3120

TCACCAACAC CGATACTTAA TGTTAATTGG GCACGATAAC CAACACTTTT TTCACGTAAT 3180 

TGACTCAAGA TATCAAATTT AGATTCTTCT AAGTCAGCTA ATATTTTTTG ATTTAAATAG 3240

GCTACGAATT GATCGGAACT GTATCTTTTG AAAAATATAT TATACTCAGT TGCCCATCGA 3300 

CTAATGACAC GCGTTACCAT TGAGTTGATT TCCGAACGCT GCGTATCATT CATATTTTGC 3360 

GTAATCTCAT CGTAGTTATC TAAAAATAAT GTCGCAATGA TTGGTTTAGA ATTTTCATAT 3420 

AGTTCATTTG TTTGTACTTG TTCAGTTATA TCAAAGAAAT AGAGGCAGTG ATCATTCTCA 3480

GAATAACGTA CTTGGAAATG ATACTGATTA TATTCTATTT cAACGGATTT CACTCTATCT 3540 

AATTGCTTTA AAATGTTTGG AAATACTTCA TTTACAGATT CAGAAATGAC ATTCGCTTCC 3600 

ATATGATCTG TCATAAATTG GTTAACCCAT TCGATGTGAT CATTTTCATC TAAAACAATG 3660

ATACCAATTG GTAAATGTTT GATTGCTTTA TTATTTGTTG TTGAAATTTG AGCACTCAAA 3720 

CCATCTACAT AACTATCCAT TTTCATTAAA GCTTGTCTGA ATAAAATGAT GCTAACAATA 3780 

ATCATCACGA CAAGAACGAT AGATGCAATT AGTGCTATAA GACTATTAAA GATAAACCAT 3840 

ACACCCATTA AAACAATTGC TGTGATGATC ATGATGACAA ATGGTATTAG TAAAGCTTTC 3900 

TTAGTGGACT GCCGATTCAT TATTCCACCT CTATTCACTT TTTAGAATTA TTTTTCATGA 3960 

TTCGCTTCAA ATTCAAACTT AAATCGATAA CACCAAGTAG TCCTACAATA TGTGTCGTAG 4020

GTGTCAGTAT TGTACCGATA ACCAATAGTA AAATCGTTAC TGCATTCGGC AAACCTTTCG 4080

CTTTACCAAA GAAATGAATA ACACTTAAAC CTTGAATATA CATTACTAAT GATAACACAA 4140 

GTTGGAAGTT TAAAAGAATG CTCTGGAACA CACTCGGTTG ACCTGTAAAT AATAAACATA 4200 

TGATAACAAT AATGTATATC CATAATAAAA TACCGCTCAT TTGCCACGCG AAAAGTGGCT 4260

TAAATACAGG TGTAGCGATT TTAAATTTTC GTAAAATCGG AAATGTAACG ATTAAGTTAA 4320

TTAAGACGAT TAAAAATGTA ATGATAATGA TGAAACCTGG TAATTGAACG GTCGCTTGTC 4380

TAAACCCTTC TTCTAATATT TGGGTCATAT TCGCATCGGC ACCGCTCATC GTAATCGCTT 4440 

CATGTAATGT TTGCTTGAAA GGTTTTACTA TGCTCGCTGA TGGTGGAATC CTTCCGAATG 4500 

TTTGTAGTAA CATAAAAGCG ATTAATGAAA TTnArCTCAT CGCTACTGTT GTTACGTATA 4560 

ATATTCTTTC TTTAGACGTT CTTTCTTTGA GCAATTGACC AATAATTAAA CTTGCAATTA 4620 

AGACTAATAT GATGGCACTT AAAACGAAAG TATTACCTAA AACAGTTGTT ATAATTACTG 4680

TAATAAGTGC ACTAATCCCG AAAGATTGTA TTGATTTATT CCATAAAACG ATACCTGGTA 4740 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

ATGCGATTAA CTCTGGAAAT ATCTTTTCCA TATTTACGTn TTAAATTATT CAGCAAATTC 60 

ATACGAGaTT CATACTCGTT yAACACTTGT TCGTCGAATT CTGTATTAGC CATTTCATCA 120 

TATAACTCAT GTTTTGCATC TTCTAAAATG TAGTAAAATT GATCAATATC TTCTTTTAAT 180 

TTGTCATATT TGTTTGGAAC TATATCGTTT ATTGTTAACA AATGGTTGCT TAGTTCATAT 240 

AAACGATCAG TGATAGCATT TTCATCCGTT AATGTCATAT ATGCGTTATT AAGCGCTAAG 300 

CTTAATTTTT CAGAGTTTTG AATGCGTTTA ATATCTATTT CAAGTTGCTC TATTTCGCCT 360 

TCTTTTAGAT GTGCTTCAGA CAATTCTTCT AATTGGAATT TCATTAAATC TAAACGCTGT 420 

AGCAATGCTT GGTCTGCTGA TTCTAAATCT TCTAACTCTT GCTTTTTGGC TTTATAATTT 480 

TGAAAAGTTT GGTGATATTT ATCCAACAAA TCTTGATAAC GTGATTCTGC GTAATTATCC 540 

AATAATGTTA AATGGTATTT TTGTTTCAAC AAAGACTGCG TTTCATGTTG GCCATGAATA 600

TCTAATAATT CTTGCATAAC TTTTCGTAAA TCTTGTAAAG TAACTGTTTG ATTATTAATT 660 

TTACAAAGAC TTTTACCAGA GCTGAAAATT TCCCGTTTAA CTAATAAAAA ATCTTCATCT 720 

ACATCAATAT CCATATTTTT CAATATATGT ATAGCATCTT TACTCTCGTC AATATCAAAT 780 

ATACCTTCGA TGACAGCCTT TTTTTCACCA TGTCTTACAA AATCAGATGA AGCTCTCATT 840 

CCAATTAATT GTCCAATTGC ATCTATAATA ATTGACTTAC CTGAACCCGT TTCACCACTT 900 

AAAACAGTTA AACCATCAGA AAATTGAATT TCTAATTCTT CAATAATAGC AAATTGCTTG 960 

ATTGATAAGG TTTGTAACAT AAACTCATCG CATCCTTATA ACAAATTGAA AATTCTTGAC 1020 

TTGATTTCAT CACTTGCCTC TTTGCTTCGA CAAATAATTA AACAAGTATC ATCACCACAA 1080 

ATTGTGCCTA GTACTTCTTC CCAATTGATT TGGTCTAATA TAGCTCCAAT AGATTGTGCA 1140 

TTACeAGGTA TGTTTTTAGA ACAAGTAAAT TATCAGTACC ATCTATATTA ACAAAGGAAT 1200 

CCATTAAATA ACGTCCCAAT TT 1222
(2) INFORMATION FOR SEQ ID NO: 14: 
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SO 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1021 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

TTTGTTATTA TTACnTnAAA TAATTGCATT ACTTTTTACT GATGGTACAA CTTTCCATCC 60
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TTCTTTTGGC ACGACATAAT TGTCTTTATC TTGAACTAAA TATCCGCCAG ATACTGAAAC 180 

AAACTCTTCT TCGTTACTGT CTATAGTCAT ATCAATTTCT AATAATCTTA CATTCTTCTT 240 

TTGTTTTAAA ATATCTAATG CTTCATCTGT AAATTTTGGT GCAATAATGA CTTCCAAAAA 300 

GATACTATGC AATTGCTCTG CTAACTCAGG TGTTACAGCT CGGTTTAATG CAACAATTCC 360

ACCAAATATT GATTGACTAT CCGCTTCATA CGCATGTTGA AATGCTTGTT CTATCGTGTC 420 

ACCGATACCA ACACCACATG GATTCATGTG TTTAACCGCA ACTGTAGCAG GTGTATCAAA 480

CTTTTTAACT AAAGCTAGTG TAGCATCTGC ATCTTTAATA TTGTTATAGC TTAATTGTTT 540 

CCCATGTAAT TGTTTAGCGC CTGCAATCGT GTGCTTAGCA TTCGAAGTTC TCACAAAATA 600 

CGCTGATTGT TGTGGATTTT CTCCATATCT TAAAGTTTCT TTATCCCCTT TAAAGAAACG 660 

TACAATCGCT TCATCATATT CTGCAGTATG CTCAAAAACT TTAATCATTA ATGATTGTCT 720 

ATATGACTCA TCTAACGAAT CGTTTCTTAA TCGCGTCAAT ACTTCTTGAT AATCTGCCGG 780

ATGTACAATT GTTGTTACAT GTTTATAGTT TTTAGCTGCA GCACGTAACA TTGTTGGACC 840 

ACCAATATCA ATATTTTCAA TTGCTTCGTC CATCGTCACA TCAGGGTTTG CAACAGTTTG 900 

TTGGAATGGA TATAAATTAA CTACTACCAT ATCAATTAAA TCTATATGTT GTTCTGATAA 960 

TTCATTTAAA TGCTGCGGTT TATTTCGATC AGCTAAAATG CCACCATGAA CAGCCGGATG 1020 

T 1021 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3759 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

TCATTCACTC CTAAATTGTT ATTACACTAT TACACaTAGC TAATCATCAA TGTGAAATCA 60 

CCTTCAAAGA CACTATCCAA ATCTTCAGAA GTCAAAATAA AGTTTGTACC AGTAGTCAGT 120 

TTGAAAATTT CACCATCGAC AATCATTTGC CCTTCGCCTT CCAACACTGT AACTAAACAG 180 

AACTCTCTAG GCTTCATATA ATTTAACGTG CCAGAAATTT CCCATTTAAC CAATGTAAAG 240 

AAATCATTCG ATACAATGTG TGTACACTTA TGGTTTTCAA TAATTTCGCT TTCAGGCAAA 300

ATATTAGGTA ATGGTGCATT GTACTGAATA ACGTCTAAAG CTTTTTCAAT ATTTAACGGT 360 

CTATCATTAT ATTGATTATC TTGACGATTG AAATCATAAA GTCTATATGT AATGTCTGAC 420 
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ATAAAAtAGa ATTCyCCAGG kTTTACtTTA AtatATCyAA gTAtCGaCCC tATCGTTCCG 540

TGTTGAACAT GATTCGCAAC TTCTTCTCTA GACTCTGCTA ATGTCCCtAT AACTATTTCT 600

GCATCTTCTT CTGCATCTAT AATATACCAA CATTCAGATT TGCCATATTG CCCgTTTTCA 660

TGCTCATAAG CATAAGAATT ATCAGGGTGC ACATGAATAG AAAGTGATTC TCTTGCATCC 720

ACTATTTTAG TTAGAAGCGG AAAATCTTTG CTTGGGAAAT CACCAAACAA TTCACGATGT 780

TCTGACCAAA TACGGTCTAA TGTTTGACCT TGATATGGTC CATTAATAAT CTCGCTCGTA 840

CCATTTGGAT GTGCTGACAC ACACCAACAT TCCCCCAGTT GTATCATTGT CTAATTGATA 900

TCCAAACTCA CTTAGACGTT GACCGCCCCA TAATTTTGTT TTTAAAATTG GTTGTAAAAA 960

TAATGGCATT GTTGCACCTC CATTGTGATT AAGTAAGCAA TAGAACTCTG ATGTTGTTGT 1020

TCCATTATAT TTTGATTTTG TTCTCATTTA CATCGTATTA TTAACTTCCA CATTTCAAAT 1080

TAACTATTAG TGATTGTACC ATATTTACTA ACATTGCAGT ACTGCCAATT AAAAGnGCTT 1140

CACTTAAATT TACAGTACTT TAACATTTTC AAAAATTTAT AGCATAGAGA TTATATCTCT 1200

CTTACATTTG TACATATTTC CCTTTAAATT TACTCGCCCA TTATACCAAT TAATAaACAA 1260

CTTTAATAGT TGTGCCATAC ATTGTTCAAA TTCTTTGTAA AACGCATAGA CAATACGTAC 1320

TTATTCATAC TTATAATTCA TCATTTTCAA AAAATAACGA GTTACGAAAA AGTAACCCGC 1380

TTCAAATCAT ATTTACTATC CTTATTAATC CGTTTCATTT TCAAATTGAG TTAAAGCATC 1440

TTTAATGTCC TGATCACCAC TAATAATTTG AAACTCTTGG TGATTAAAAT GATTGGATGT 1500

GACAATTTCT TTTAATACTG TCGCAACATC TTCTCTAGGA ATTTCACCTT TACCATCAAA 1560

ATATTGTGCA GCTTCTATCT TTCCAGATCC TGCTGCATTT GTAAGTGCCC CTGGATGTAA 1620

AATTGTATAA TTCAAACCTG nAACGTCTTA AATAGTCATC AGCGTAATGT TTAGCTATTG 1680

TATATGGCTT TAAATCACCG CTATCATCAA AAGCCTGACG TCTCGAATCA TATGTTGAAA 1740

CCATGACATA GTGTTTAATA TTGGCCTCTT TACTCGCAAT CATTGATTTA ACAGCACCAT 1800

CTAAATCGAC AATAATTGTT TTATCTGCAC CCGTGTTCCC TCCAGAACCT ACTGAAAAGA 1860

TAACTTTATC GAATGGTTTA AACGTCTCAG TTAAAGTCTC TATTGAATCA TTTTCAACAT 1920

CAACAAGAAT TGCTTTCATA CCTTGTGATT TTAACGCATT AAGTTGATCT GATTGCCTAA 1980

CACCAGCAGT AAATGGTACA TTTTCTTTTG CTAATTGTTG CACTAGTAAC GAACCTACAC 2040

CGCCATTAGC ACCTATAACC AAAATATTCA TTTACAACAC TCTCCTATkT ATTATTCTCT 2100

ATGCCATACC ACTTTATGAG ATATGTAAAA CTTGTTACAA CTATAAAAAT CAATTGACAT 2160

ACTACTGGGA ACGTATTAAA TTAATATATG AACAAATATT CATATGAAAG GATTGTCATA 2220
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tCaAGGCATT AGcGATTACA ATCGAATACG TATCaTGGAA TTGTTATCaG TCAGCGAAgC 2340 

AAGTGTTGGT CACATTtCAC ATCAATTGAA TTTATCTCAA TCAAATGTCT CGCACCAATT 2400 

AAAATTACTT AAAAGTGTGC ATCTTGTGAA AGCAAAACGA CAAGGCCAAT CAATGATTTA 2460 

TTCATTAGAT GACATCCACG TAGCAACTAT GTTAAAGCAA GCCATACATC ACGCGAATCA 2520 

TCCTAAAGAA AGTGGGTTAT AATATGTCTC ATTCACATCA TCATCATGAC CATATGCATA 2580 

GTCATGTAAC TACAAATAAT AAGAAAGTAT TGTTTATATC GTTTTTAATA ATCGGTCTAT 2640

ATATGTTTAT CGAAATCATC GGCGGTCTCC TTGCTAACAG CTTGGCATTA CTATCTGACG 2700 

GTATCCATAT GTTTAGCGAC ACATTCTCAT TAGGTGTTGC ACTTGTCGCA TTTATTTATG 2760 

CTGAAAAGAA TGCCACAACT ACAAAAACAT TTGGTTATAA ACGTTTCGAA GTACTCGCAG 2820 

CGTTATTTAA CGGTGTAACG CTTTTTGTAA TAAGTATTTT GATTGTTTTT GAAGCGATTA 2880 

AACGTTTCTT TGTTCCTTCT GAAGTTCAAT CAAAAGAAA? GTTAATCATT AGTATTATCG 2940

GTTTAATTGT CAATATCGTT GTTGCATTCT TTATGTTTAA AGGCGGCGAC ACTTCACACA 3000 

ATTTAAATAT GCGTGGTGCT TTTCTACATG TTATCGGAGA CTTATTAGGT TCAGTTGGCG 3060 

CCATTACTGC AGCTAkTTTA ATTTGGGCAT TTGGATGGAC AATCGCCGAT CCTATCGCAA 3120 

GTATTTTAGT TTCCGTTATT ATTTTAAAAA GTGCTTGGGG TATCACAAAA TCTTCAATTA 3180 

ACATTTTAAT GGaAGGCACA CCAAGTGATG TTGATATAGA TGAAGTTATA ACTACTATTA 3240 

AAAAGGATTC ACGAATACAA AGTGTGCATG ATTGCCATGT TTGGACAATT TCAAATGATA 3300

TGAATGCATT GAGTTGTCAT GTTGTTGTAG ACCATACATT GACAATGAAA GAATGTGAAT 3360 

TATTATTAGA AAaCATTGAG CATGATTTAT TACATTTAAA TATTCACCAT ATGACTATTC 3420 

AATTAGAAAC GCCTAATCAC AAACATGATG AATCGATTAT ATGTTCAGGA ACACATAGTC 3480

ATTCACATAA CCATCATGCT CATCATCACG CGCATGTACA TTAATAATTT TAACCTACTG 3540

CCATTGCATC GATTAAACTT TTCAATGGCA GTAGGTTTTT TATGTCTTTA TGGCGACTTG 3600

TTTGGTCTTT GATGATGCAA TGTTTATTAA CAAATTTTCA ACTATTATTT CTTACATTAG 3660 

TCATATTTTT GACAATTTAC TATTATAATT CTCTAACTTT AGTCACTTTA ATTAATTTTT 3720 

ATTAGATATT AATATGAAAA TAACGTGTTT TTTGTTATT 3759

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13086 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

TAATTATCGC GCATAACAAA ACATTAGCAG GACAATTATA TAGTGAGTTT AAAGAATTTT 60 

TTCCTGAAAA CAGGGTGGAA TACTTTGTAA GTtACTATGA TTATTATCAn CCAGAGGCAT 120 

ACGTACCGTC TACTGACACT TTTATTGAAA nAGATGCCTC AATCAnTGAT GAAATTGATC 180 

AACTACGACA TTCTGCTACA AGTGCATTAT TTGAACGCGA TGATGTAATT ATTATTGCTA 240 

GTGTAAGTTG TATATATGGT TTAGGTAATC CTGAAGAATA TAAAGATTTA GTAGTAAGTG 300 

TTCGAGTTGG TATGGAAATG GATAGAAGTG AATTACTTAG AAAACTTGTc AGATGTGCAA 360

TATACACGAA ATGACATCgA TTTcCAACGA GGAACGTTTC GAGTGCGTGG TGATGTAGTG 420 

GAAATATTCC CAGCCTCTAA AGAAGAACTT TGTATAAGGG TTGAGTTTTT CGGCGATGAG 480 

ATTGACCGTA TCCGAGAAGT TAACTACCTA ACAGGTGAAG TGTTGAAAGA AAGAGAACAT 540 

TTTGCGATAT TCCCAGCTTC TCACTTCGTA ACACGTGAAG AAAAGTTGAA AGTTGCGATT 600 

GAACGTATTG AAAAAGAATT GGAAGAACGA TTGAAAGAAT TACGAGATGA GAATAAATTA 660 

CTAGAAGCGC AAAGGTTAGA ACAGCGTACC AACTATGATT TAGAAATGAT GCGAGAGATG 720 

GGATTCTGTT CAGGAATTGA AAACTATTCC GTACATTTAA CTTTGCGACC ACTGGGTTCG 780

ACACCATATA CTTTATTGGA TTACTTTGGC GATGATTGGT TAGTAATGAT TGATGAATCA 840 

CATGTGACAT TACCGCAAGT TCGAGGCATG TATAACGGAG ACAGAGCGCG TAAACAAGTT 900 

TTGGTGGATC ATGGGTTTAG ATTACCGAGT GCATTAGATA ACCGTCCACT TAAATTTGAA 960

GAATTTGAAG mAAAGACAAA ACAACTTGTG TATGTATCTG CAACGCCTGG ACCATACGAA 1020 

ATTGAACATA CGGATAAGAT GGTTGAACAA ATTATTCGTC CTACTGGTTT ACTGGATCCT 1080 

AAGATTGAGG TTAGACCTAC TGAAAATCAA ATTGACGATT TATTAAGTGA AATTCAAACA 1140 

AGAGTgAGCG TAATGAACGC GTACTTGTTA CAACGCTCAC TAAAAAGATG AGTGAAGATT 1200 

aACCACATAC ATGAAAGAaG CGGGTATTAA aGTtAATTAT CTGCATTCAG AAATCAAGAC 1260 

ATTAGAACGA ATTGAAATAA TTAGAGACTT ACGAATGGGT ACATATGATG TTATCGTAGG 1320 

TATTAATTTA TTAAGAGAGG GTATTGATAT ACCAGAAGTT TCTCTAGTTG TCATATTAGA 1380 

TGCAGATAAA GAAGGGTTTT TACGTTCTAA CCGCTCATTA ATTCAAaCAA TAGGTAGAgC 1440

TGCGCGTAAC GATAAaGGTG AAGTCATTAT GTATGCCGAT AAAATGACTG ATTCGATGAA 1500 

GTATGCAATT GATGAGACAC AACGTCGTCG AGAAATACAG ATGAAACATA ATGAAAAACA 1560 

TGGTATTACA CCTAAAACAA TTAATAAAAA AATACATGAT TTAATTAGTG CTACTGTTGA 1620

AAATGACGAA AATAATGACA AAGCACAAAC TGTGATACCT AAGAAGATGA CGAAAAAAGA 1680 
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TTTCGAGAAA GCTACAGAAT TAAGAGATAT GTTATTTGAA TTAAAAGCAG AAGGGTGACA 1800 

AGTAAATGAA AGAACCATCC ATAGTAGTAA AAGGTGCTCG TGCGCATAAC TTGAAAGATA 1860

TTGATATCGA ACTACCTAAA AaTAAATTAA TTGTTATGAC AGGTTTATCT GGGTCAGGTA 1920 

AATCGTCATT AGCATTCGAT ACTATATATG CTGAAGGACA ACGACGTTAT GTTGAATCAT 1980 

TAAGTGCCTA TGCGCGTCAA TTTTTAGGCC AAATGGACAA ACCAGATGTT GATACAATTG 2040 

AAGGATTATC GCCAGCAATT TCAATAGATC AAAAAACAAC AAGTAAAAAT CCAAGATCAA 2100 

CTGTAGCAAC AGTAACAGAA ATATATGATT ATATACGTTT GTTATATGCA CGTGTTGGTA 2160 

AACCTTACTG TCCAAATCAC AATATAGAAA TTGAATCGCA AACAGTACAA CAAATGGTTG 2220 

ACCGCATTAT GGAATTAGAG GCACGTACAA AGATTCAATT ATTAGCACCT GTCATCGCTC 2280 

ATCGTAAAGG TAGTCATGAA AAGCTAATCG AAGATATTGG TAAAAAAGGT TATGTACGTT 2340 

TAAGAATCGA TGGCGAAATT GTTGATGTAA ATGATGTACC TACTTTAGAT AAGAACAAGA 2400 

ATCATACAAT AGAAGTTGTT GTAGACCGAT TAGTTGTTAA AGATGGAATT GAAACACGAC 2460

TAGCTGACTC TATAGAAACT GCCTTAGAGC TTTCAGAAGG ACAATTAACA GTCGATGTCA 2520

TTGACGGGGA AGACCTTAAG TTTTCAGAAA GCCATGCTTG TCCTATATGT GGATTTTCAA 2580

TCGGAGAGTT AGAACCAAGA ATGTTTAGCT TTAACAGTCC TTTTGGTGCT TGTCCGACAT 2640 

GTGATGGCTT AGGCCAAAAG TTAACAGTCG ATGTAGACTT GGTTGTTCCC GACAAAGATA 2700 

AGACGCTAAA CGAAGGTGCA ATAGAACCTT GGATACCGAC GAGTTCTGAT TTTTATCCAA 2760 

CATTGTTAAA ACGTGTTTGT GAAGTTTATA AAATCAATAT GGATAAACCT TTTAAAAAGT 2820 

TAACAGAACG TCAACGTGAT ATTTTATTGT ATGGTTCTGG TGACAAAGAA ATTGAATTTA 2880 

CATTTACACA ACGTCAAGGT GGTACTAGAA AACGAACAAT GGTTTTCGAG GGTGTAGTTC 2940

CTAATATAAG TAGACGATTC CATGAATCTC CTTCAGAATA TACACGTGAA ATGATGAGTA 3000 

AATATATGAC TGAACTACCT TGCGAAACTT GTCATGGAAA GCGATTGAGT CGTGAAGCkT 3060 

TATCTGTTTA TGTAGGTGGT TTAAATATTG GTGAAGTAGT CGAATATTCA ATCAGTCAAG 3120

CGCTGAACTA TTATAAAAAC ATTGATTTGT CAGAACAAGA TCAAGCGATT GCAAATCAAA 3180

TATTGAAAGA AATTATTTCC CGACTCACTT TTTTAAATAA TGTGGGACTT GAATATTTAA 3240

CGTTAAACAG AGCTTCAGGT ACACTTTCAG GTGGTGAAGC ACAACGTATT CGATTAGCAA 3300 

CGCAAATTGG GTCGCGTTTG ACTGGTGTCT TATATGTATT AGATGAGCCA TCAATTGGAC 3360 

TGCATCAAAG AGATAATGAT CGATTAATTA ATACACTTAA AGAAATGAGA GATTTAGGAA 3420

ATACTTTAAT TGTAGTTGAA CACGATGATG ATACAATGCG TGCGGCTGAT TACTTAGTGG 3480 
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AGGTAATGAA AGATAAAAAA TCATTAACAG GACAATACTT GAGTGGTAAG AAACGTATTG 3600 

AAGTACCTGA ATATCGCAGA CCGGCTTCAG ATCGTAAAAT TTCTATACGT GGAGCTAGAA 3660

GCAACAATCT TAAAGGGGTT GATGTGGACA TACCACTATC AATCATGACG GTTGTTACAG 3720

GTGTATCAGG TTCTGGTAAA AGCTCATTAG TAAATGAAGT ATTATACAAA TCATTAGCTC 3780 

AAAAAATTAA TAAATCTAAA GTAAAGCCAG GATTGTACGA TAAGATTGAA GGTATTGATC 3840 

AACTTGATAA AATTATTGAT ATTGATCAAT CACCAATAGG TAGAACGCCA CGCTCTAATC 3900 

CAGCAACATA TACTGGTGTG TTTGATGATA TACGTGATGT GTTTGCGCAA ACAAATGAAG 3960 

CTAAAATTCG AGGATATCAA AAAGGGCGTT TTAGTTTTAA TGTAAAAGGT GGACGCTGTG 4020 

AAgcTTGTAA AGGTGACGGT ATTATTAAAA TTGAAATGCA TTTTTTACCT GATGTTTATG 4080 

TTCCTTGTGA AGTGTGTGAT GGTAAACGAT ATAATCGTGA GACACTAGAG GTTACTTACA 4140 

AAGGTAAAAA TATTGCTGAC ATTTTAGAAA TGACTGTTGA AGAAGCAACA CAATTTTTTG 4200 

AAAATATTCC TAAGATTAAG CGCAAGTTAC AAACACTAGT TGATGTTGGT CTTGGATACG 4260 

TCACATTAGG TCAACAAGCT ACAACGTTAT CAGGTGGTGA GGCTCAACGT GTGAaACTTG 4320 

CATCTGAACT TCATAAACGT TCAACTGGTA AATCTATTTA TATCCTAGAT GAACCGACAA 4380

CAGGGTTACA TGTTGACGAT ATTAGTAGAT TATTAAAAGT ATTAAACCGA TTAGTTGAAA 4440 

ATGGTGATAC TGTTGTAATT ATTGAACATA ACCTAGATGT TATCAAAACA GCAGACTATA 4500 

TTATAGACTT AGGTCCTGAA GGTGGTAGTG GCGGTGGTAC TATTGTTGCG ACTGGCACAC 4560

CCGAAGATAT TGCTCAGACA AAGTCATCAT ATACAGGAAA GTATTTAAAA GAAGTACTTG 4620 

AACGAGATAA ACAAAATACT GAAGATAAAT AAGATTAAAA GAAGTGAAGG ATGTTATAAA 4680 

TTTATCCTTC GCTTCTTTTT ATTAATTTAG TAATGAATAG TAGAAAGAAA AGATGCGTAA 4740

AAAGAATTAT GTTAAGATAG GGTCAATCTA GAGTAGTTAA ACATAAATCG AACTGGGAGT 4800 

GGGACAGAAA TGATAAAGAA TCACTAATGA TTTATTATGT AGTGGTTCTT TGTCATTAGC 4860 

CACAGCTATT GTGTACTTAA AAATAGGaat GCaTgAGTGC AACTCATGCA TAAGaAATAC 4920 

TAATTTCTAA AGAAAAAGTA TTTCTTTATG TTGGGGCCCC GCCAACTTGC ATTGTTTGTA 4980 

GAATTTCTTT TCGAAATTCT TTATGTTGGG GCCCCGCCAA CTTGCATTGT TTGTAGAATT 5040

TCTTTTCGAA ATTCTTTATG TTGGGGCCCC GCCAACTAAT TCCAATATAT CATTGTAGAG 5100 

CTTAGGTCAT TGATTTTTGG CTCGGACTTT TATGGCGATA TGAACCATGT AAATTAAGCA 5160 

AGCAATAAAT TAATGATTGA TATTGACTTG TAAAATAATA ACAATAATGA ACAATTAATA 5220

TTTATTTTAG CTTTTCAATG TAGATTGGTG TTATATTTTT GATATGATAA GAAGAGATGT 5280 
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ACATTAAAGT TAGATTTAAT CGCTGGTGAA GAAGGACTAT CGAAGCCAAT TAAAAATGCT 5400 

GATATATCAA GACCGGGCTT AGAGATGGCA GGTTATTTTT CACATTATGC GTCAGATAGA 5460

ATACAACTAT TAGGAACAAC GGAACTATCG TTTTACAATT TATTACCAGA TAAGGATCGC 5520

GCAGGTCGTA TGCGTAAACT ATGCAGACCA GAAACGCCTG CAATTATTGT GACACGTGGA 5580 

TTGCAGCCAC CAGAAGAATT AGTTGAAGCT GCAAAAGAAT TAAATACCCC ACTTATAGTT 5640 

GCTAAAGATG CGACTACAAG TTTAATGAGT CGCTTAACAA CGTTTTTAGA GCATGCACTT 5700 

GCAAAGACGA CATCTTTACA TGGTGTTTTA GTAGATGTTT ACGGTGTTGG TGTACTAATT 5760 

ACCGGTGATT CAGGAATAGG TAAAAGTGAG ACTGCGTTGG AATTAGTTAA ACGTGGGCAT 5820

AGATTAGTAG CAGATGATAA TGTAGAAATA CGTCAAATTA ATAAAGATGA ACTAATAGGG 5880 

AAACCACCAA AGTTAATAGA ACATCTATTA GAAATACGTG GACTAGGTAT TATCAATGTT 5940

ATGACTTTAT TTGGCGCGGG TTCAATATTA ACTGAAAAAC GAATTAGATT AAATATTAAT 6000 

TTGGAAAACT GGAACAAGCA AAAGTTATAT GACCGCGTAG GTCTTAATGA AGAGACGCTA 6060 

AGTATTTTAG ATACTGAAAT CACTAAAAAA ACAATACCTG TAAGACCTGG TAGAAATGTT 6120 

GCGGTAATTA TTGAGGTCGC TGCAATGAAC TATCGATTAA ATATCATGGG CATTAACACG 6180

GCCGAAGAAT TTAGTGAAAG ATTAAATGAA GAAATTATCA AGAACAGTCA TAAGAGTGAG 6240 

GAGTAGGTTG AATGGGTATT GTATTTAACT ATATAGATCC TGTGGCATTT AACTTAGGAC 6300 

CACTGAGTGT ACGATGGTAT GGAATTATCA TTGCTGTCGG AATATTACTT GGTTACTTTG 6360

TTgCACAACG TGCACTAGTT AAAGCAGGAT TACATAAAGA TACTTTAGTA GATATTATTT 6420

TTTATAGTGC ACTATTTGGA TTTATCGCGG CACGAATCTA TTTTGTGATT TTCCAATGGC 6480 

CATATTACGC GGAAAATCCA AGTGAAATTA TTAAAATATG GCATGGTGGA ATAGCAATAC 6540

ATGGTGGTTT AATAGGTGGC TTTATTGCTG GTGTTATTGT ATGTAAAGTG AAAAATTTAA 6600 

ACCCATTTCA AATTGGTGAT ATCGTTGCGC CAAGTATAAT TTTAGCGCAA GGAATTGGAC 6660 

GCTGGGGTAA CTTTATGAAT CACGAGGCAC ATGGTGGATC GGTGTCACGC GCTTTTTTAG 6720 

AACAATTACA TTTGCCTAAT TTTATAATAG AAAATATGTA TATTAACGGC CAATATTATC 6780 

ATCCAACATT CTTATATGAA TCCATTTGGG ATGTCGCTGG ATTTATTATC TTAGTTAATA 6840 

TTCGTAAACA TTTAAAATTA GGAGAAACAT TCTTTTTATA TTTAACTTGG TATTCAATTG 6900

GTCGATTCTT TATAGAAGGA TTACGTACAG ATAGCTTAAT GCTCACAAGT AATATTAGAG 6960 

TTGCACAATT AGTATCAATT CTTTTAATTT TAATAAGTAT AAGTTTAATT GTATATAGAA 7020

GGATTAAGTA TAATCCACCG TTGTATAGCA AAGTTGGGGC GCTTCCATGG CCAACAAAAA 7080 
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TTATGGCGTG TATACCGTCT TGTTAAATTT TCGAAAGTTT TTAAGAATGT AATTATCATT 7200 

GAATTTTCGA AATTTATTCC AAGTATGGTA CTGAAAAGAC ATATATATAA ACAACTTTTA 7260 

AATATTAATA TCGGTAATCA ATCGTCGATA GCTTATAAAG TAATGTTAGA TATTTTTTAC 7320 

CCAGAACTGA TTACGATTGG TAGTAACAGT GTTATTGGTT ACAATGTAAC AATTTTGACG 7380

CATGAAGCAT TAGTTGATGA ATTTCGTTAT GGACCAGTGA CGATAGGATC TAACACTTTG 7440

ATTGGTGCAA ATGCTACCAT TTTACCCGGT ATAACGATTG GTGACAATGT AAAAGTTGCA 7500 

GCTGGTACGG TTGTTTCAAA AGATATACCG GATAATGGAT TTGCATATGG CAACCCTATG 7560 

TATATAAAAA TGATTAGGAG GTGACAATTT TATGGCGCAA AAGAATAATA ATGTAATTCC 7620 

AATGACTTTT GATGATGCAT TTTATCGTAA AATGGCTAAA CAGAAGTTTA AACAAAGAGA 7680 

ATATAAACGA GCTGCTGAAT ACTTTGAAAA AGTGTTAGAA TTGTCACCTG ATGATCTGGA 7740 

AATTCAAATT GATTATGCAC AATGTCTAGT GCAACTTGGT ATTGCTAAAA AAGCAGAACA 7800

TTTATTTTAT GACAATATTA TTTATAATAG GCATCTAGAA GATAGCTTTT ATGAATTGAG 7860

TCAGCTCAAC ATTGAAGTTA ACGAACCAAA CAAGGCATTC TTGTTTGGTA TTAATTATGT 7920 

TATTGTTAGC GACGACCAAG ATTATAGAGA TGAATTAGAT CAAATGTTTG ATGTGAAATA 7980

TCAAAGTGAA GAACAAATTG AACTTGAAGC TCAATTGTTT GTAGTTCAAA TACTATTCCA 8040

ATATCTTTTT TCTCAAGGTC GATTAAAAGA TGCAAAGAAT TATGTCTTAC ATCAACCACA 8100 

AGAAGTTCAA GATCATCGTG TAGTACGTAA TTTATTGGCA ATGTGTTATT TATATCTCGG 8160

TGAATATGAT ACgGCTAAAG CATTGTACGA aGCACtATTA CAAGAGGATA GTACaGATAT 8220

ATATGCATTA TGCCATTATA CTTTGCTACT TTATAACACT AAGGAAAATG AACAATATCA 8280 

AAAATATTTA AAAATATTAA ACAAAGTTGT ACCTATGAAT GACGATGAAA GTTTTAAATT 8340 

AGGTATTGTA TTAAGTTATT TAAAGCAGTA TCGTGCATCA CAACAATTGT TGTACCCTTT 8400 

ATATAAAAAA GGGAAATTTT TATCAATTCA AATGTACAAT GCTTTAGCAT ATAATTATTA 8460 

TTATTTAGGT GAAGAAGACG AAAGTCATTA CTACTGGGAT AAATTGAAGC AAATTTCTAA 8520 

AGTGGAAATT GGACATGCGC CTTGGGTAAT TGAAAATAGC AAAGAAGTTT TTGACCAACA 8580 

TATTTTGCCA TTACTTCAAA GTGATGACAG TCATTATCGT TTATATGGTA TTTTTTTATT 8640 

GGATCAATTA AATGGTAAAG AAATTGTGAT GACGGAAAGT ATTTGGCAGG TTTTGGAAAA 8700 

TCTAAATAAT TATGAGAAAT TGTATTTAAC GTATTTAGTT CAAGGTTTAA CGCTCAATAA 8760 

ATTAGACTTC ATTCATCGCG GCTTATTAAC GCTTTACCAT AATGAATTAT TTGTAAGTGA 8820

AAATGATGTA ATGGTTGCAT GGATTAATCA AGGTGAACTC ATAATTGCTG AAAAAGTAGA 8880 
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TCGAAACGTT ACAAAGAAGC AAATTACAAC ATGGTTAGGC ATAACACAAT ATAAACTGAA 9000

CAAAATGATT GAATTTCTCT TGAGCATATA GATTTATGAA AAGTTAGATT TATTATATAA 9060

TGCGCATAAT GATTAATAAT GAGGAGGCGT TAATAAAATG ACTGAAATAG ATTTTGATAT 9120

AGCAATTATC GGTGCAGGTC CAGCTGGTAT GACTGCTGCA GTATACGCAT CACGTGCTAA 9180

TTTAAAAACA GTTATGATTG AAAGAGGTAT TCCAGGCGGT CAAATGGCTA ATACAGAAGA 9240

AGTAGAGAAC TTCCCTGGTT TCGAAATGAT TACAGGTCCA GATTTATCTA CAAAAATGTT 9300

TGAACACGCT AAAAAGTTTG GTGCAGTTTA TCAATATGGA GATATTAAAT CTGTAGAAGA 9360

TAAAGGCGAA TATAAAGTGA TTAACTTTGG TAATAAAGAA TTAACAGCGA AAGCGGTTAT 9420

TATTGCTACA GGTGCAGAAT ACAAGAAAAT TGGTGTTCCG GGTGAACAAG AACTTGGTGG 9480

ACGCGGTGTA AGTTATTGTG CAGTATGTGA TGGTGCATTC TTTAAAAATA AACGCCTATT 9540

CGTTATCGGT GGTGGTGATT CAGCAGTAGA AGAGGGAACA TTCTTAACTA AATTTGCTGA 9600

CAAAGTAACA ATCGTTCACC GTCGTGATGA GTTACGTGCA CAGCGTATTT TACAAGATAG 9660

AGCATTCAAA AATGATAAAA TCGACTTTAT TTGGAGTCAT ACTTTGAAAT CAATTAATGA 9720

AAAAGACGGC AAAGTGGGTT CTGTGACATT AACGTCTACA AAAGATGGTT CAGAAGAAAC 9780

ACACGAGGCT GATGGTGTAT TCATCTATAT TGGTATGAAA CCATTAACAG CGCCATTTAA 9840

AGACTTAGGT ATTACAAATG ATGTTGGTTA TATTGTAACA AAAGATGATA TGACAACATC 9900

AGTACCAGGT ATTTTTGCAG CAGGAGATGT TCGCGACAAA GGTTTACGCC AAATTGTCAC 9960

TGCTACTGGC GATGGTAGTA TTGCAGCGCA AAGTGCAGCG GAATATATTG AACATTTAAA 10020

CGATCAAGCT TAATTCGAAG TCGAATTAAG ATGTTGAGCT GTAAATTATT TGGATATTTA 10080

TTTTAATAGT GTCATCACAG CGTTAAAATA ATGTCTTACT TTTAAATTAA AGCAAATTAT 10140

ATAGSAAACT AGAACTTAGT ACGTATCATT TGTGCGTTTC AATGAGTTCT AGTTTTTTTA 10200

TATGTTATAT TAAACTTATA ACTTTATGGG AGTGGGACAG AAATGATAAA GAGCCACTAA 10260

TGATTTATTA TGTAGTGGTT CTTAAACATT AGCCACAGCT AATGTGTACT TAAAAATAGG 10320

AATACATGAG TAAAACTCAT GCATAAGAAA TACTAATTTC TATAGAAAAA GTATTACTTT 10380

ATCGTTGTCC CACCCCAACT TGCACATTAT TGTAAGCTGA CTTTCCGCCA GCTTCTGTGT 10440

TGGGGCCCCG CCAACTTGCA CATTATTGTA AGCTGACTTT TCGTCAgCTT CTGTGTTGGG 10500

GCCCCGCCAA CTTGCACATT ATTGTAAGCT GACTTTTCGT CAGCTTCTGT GTTGGGGCCC 10560

CGCCAACTTG CATTGTCTGT AGAAATTGGG AATCCAATTT CTCTATGTTG GGGCCCACAC 10620

CCCAACTCGC ATTGCCTGTA GAATTTCTTT TCGAAATTCT CTGTGTTGGG GCCCACACCC 10680
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ACTCGCATTG CCTGTAGAAT TTCTTTTCGA AATTCTCTGT GTTGGGGCCC CTGACTAGAG 10800

TTGAAAAAAG CTTGTTGCAA GCGCATTTTC ATTCAGTCAA CTACTAGCAA TATAATATTA 10860

TAGACCCTAG GACATTGATT TATGTCCCAA GCTCCTTTTA AATGATGTAT ATTTTTAGAA 10920

ATTTAATCTA GACATAGTTG GAAATAAATA TAAAACATCG TTGCTTAATT TTGTCATAGA 10980

ACATTTAAAT TAACATCATG AAATTCGTTT TGGCGGTGAA AAAATAATGG ATAATAATGA 11040

AAAAGAAAAA AGTAAAAGTG AACTATTAGT TGTAACAGGT TTATCTGGCG CAGGTAAATC 11100

TTTGGTTATT CAATGTTTAG AAGACATGGG ATATTTTTGT GTAGATAATC TACCACCAGT 11160

GTTATTGCCT AAATTTGTAG AGTTGATGGA ACAAGGAAAT CCATCCTTAA GAAAAGTGGC 11220

AATTGCAATT GATTTAAGAG GTAAGGAACT ATTTAATTCA TTAGTTGCAG TAGTGGATAA 11280

AGTCAAAAGT GAAAGTGACG TCATCATTGA TGTTATGTTT TTAGAAGCAA GTACTGAAAA 11340

ATTAATTTCA AGATATAAGG AAACGCGTCG TGCACATCCT TTGATGGAAC AAGGTAAAAG 11400

ATCGTTAATC AATGCAATTA ATGATGAGCG AGAGCATTTG TCTCAAATTA GAAGTATAGC 11460

TAATTTTGTT ATAGATACTA CAAAGTTATC ACCTAAAGAA TTAAAAGAAC GCATTCGTCG 11520

ATACTATGAA GATGAAGAGT TTGAAACTTT TACAATTAAT GTCACAAGTT TCGGTTTTAA 11580

ACATGGGATT CAGATGGATG CAGATTTAGT ATTTGATGTA CGATTTTTAC CAAATCCATA 11640

TTATGTAGTA GATTTAAGAC CTTTAACAGG ATTAGATAAA GACGTTTATA ATTATGTTAT 11700

GAAATGGAAA GAGACGGAGA TTTTCTTTGA AAAATTAACT GATTTGTTAG ATTTTATGAT 11760

ACCCGGGTAT AAAAAAGAAG GGAAATCTCA ATTAGTAATT GCCATCGGTT GTACGGGTGG 11820

ACAACATCGA TCTGTAGCAT TAGCAGAACG ACTAGGTAAT TATCTAAATG AAGTATTTGA 11880

ATATAATGTT TATGTGCATC ATAGGGACGC ACATATTGAA AGTGGCGAGA AAAAATGAGA 11940

CAAATAAAAG TTGTACTTAT CGGTGGTGGC ACTGGCTTAT CAGTTATGGC TAGGGGATTA 12000

AGAGAATTCC CAATTGATAT TACGGCGATT GTAACAGTTG CTGATAATGG TGGGAGTACA 12060

GGGAAAATCa GAGATGAAAT GGATATACCA GCACCAGGAG ACATCAGAAA TGTGATTGCA 12120

GCTTTAAGTG ATTCTGAGTC AGTTTTAAGC CAACTTTTTC AGTATCGCTT TGAAGAAAAT 12180

CAAATTAGCG GTCACTCATT AGGTAATTTA TTAATCGCAG GTATGACTAA TATTACGAAT 12240

GATTTCGGAC ATGCCATTAA AGCATTAAGT AAAATTTTAA ATATTAAAGG TAGAGTCATT 12300

CCATCTACAA ATACAAGTGT GCAATTAAAT GCTGTTATGG AAGATGGAGA AATTGTTTTT 12360

GGAGAAACAA ATATTCCTAA AAAACATAAA AAAATTGATC GTGTGTTTTT AGAACCTAAC 12420




GATGTGCAAC 


CAATGGAAGA 


AGCAATCGAT 


GCriTAAGGG 


AAGCAGATTT 


AATCGTTCTT 


12480 
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GCGTTAATTC ATTCTGATGC GCCTAAGCTA TATGTTTCTA ATGTGATGAC GCAACCTGGG 12600 

GAAACAGATG GTTATAGCGT GAAAGATyAT ATCGATGCGA TTCATAGACA AGCTGGACAA 12660 

CCGTTTATTG ATTATGTCAT TTGTAGTACA CAAACTTTCA ATGCTCAAGT TTTGAAAAAA 12720 

TATGAAGAAA AACATTCTAA ACCAGTTGAA GTTAATAAGG CTGAACTTGA AAAAGAAAGC 12780 

ATAAATGTAA AAACATCTTC AAATTTAGTT GAAATTTCTG AAAATCATTT AGTAAGACAT 12840 

AATACTAAAG TGTTATCGAC AATGATTTAT GACATAGCTT TAGAATTAAT TAGTACTATT 12900 

CCTTTCGTAC CAAGTGATAA ACGTnAATAA TATAGAACGT AATCATATTA TGA7ATGATA 12960 

ATAGAGCTGT GAAAAAAATG AAnATAGACA GTGGTTCTAA GGTGAATCAT GTTTTAAATA 13020 

AGAAAGGAAT GACTGTACGA TGAGCTTTGC ATCAGAAATG AAAAATGAAT TAACTAGAAT 13080 

AGACGT 13086 
(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1350 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

CATTAGTCAT GAAAATAGCC GACAACTTCA TCTGTGAAAT CACCGGCCTT TTATTTTAGC 60 

TAACTTTATT TCTGATTTTA CGATTTTAAT TGATCATACA GAGAAAGTGA TCTTTTTACA 120 

ATTTCTAAAA ACTCATGATC TATATTGGAC ATTTGATGAA AATAAGACAA AATGTTTTCT 180 

GTTAGCTTCT CTTGTTTTGG GAATGAATCA TCTTCTTTAA TCCAAATCGC TAATTCGCCT 240 

AATGGTGTTT TATCATCTTT AAATGTTTGT ATATATTCGT AAAAGCTCAT AGTATTCCTT 300 

CTCTCAATTT ACTTATATAA ATCCTACCAC GAAAGCTTTC AAGAAAACAC AATTAAATGT 360 

CTATTTAGTG AACTTTTTAA GGTTGTGCAC TCTTTTAATG TCTGCCAATT AGGTCAATTA 420 

ATCATCACAA TGTACAATTA ACTCTATTTT CAGTTCATAT ACTCACACAC CGTTTTTGAA 480 

CAACACATTA ACTTCTCATT TAGATAAAAC GCAAAAAAGC CTGGCACCAA TACAATAGAT 540 

GCCAGACTAA GAGTCTACTA TATAAATTTA TTTAGCGTAT GGTTTTACTT CGATTGCACC 600 

TTCATTTTCA TCATGAACAC CATGCTTATA ATAATCAATA TATTGTGGCT CTAAAGGCTT 660 

TCTGCCACGT ATAATGTCTG CTGCTTTTTC AGCTAACATT AAAACAGGTG CGTGTATATT 720 

GCCATTTGTC GTACGTGGCA TAGCTGATGC ATCAACTACA CGTAAATTTT CCATACCGTG 780 
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ACTACAAGAT GGGTGTAATG CTGTTTCACC ATCTCTACGA ACCCAATCAA GAATTTCTTC 900 

GTCTGTTTGC ACTTCTGGTC CTGGTGAAAT TTCTCCACCA TTGAATGGAT CCATTGCTTT 960 

TTGAGATAAG ATATTTCTTG CTACACGAAT TGCTTCTACC CATTCTTTTT TATCTTCTTC 1020 

TGTTGATAAA TAATTAAAGC GGATACTTGG TTTTTCGAAT GGATCTTTAG ATTTGATTTT 1080 

CAAGCTACCA CGAGAGTTTG AATACATTGG TCCTACGTGA ACTTGATAAC CATGTGCGAC 1140 

CGCTGCCTTT TGACCATCAT ATCTTACAGC 7ATTGGTAAG AAATGGAACA TTAAGTTAGG 1200 

ATAAtCAACT TCGTTATTTG AACGTACAAA TCCGCCACCT TCAAAATGGT TAGATGCTGC 1260 

TGCACCTGTA CGTGTGAAAA TCCATTGTAA ACCAATAAAT GGcATGCGCT TGAtATCTAA 1320 

GCTTGGCtGt AATGATACAG GTTCCTTACA 1350 
(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1376 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

TAATGCTATT GGCAACACCA TATATGAAAn CTCCAAACGA TCCTAAACCG ACTATAGATT 60 

CACCAAATTT nACAATCCAT GAATAAAGTA GTGGCCATAA GAATAACAAT ATGACAACTA 120 

AAAATGTACA GTAAAATGCA GTCATAATTG GAACTAGACG TTTACCACTA AAAAATGATA 180 

ATGCTAATGG TAATTCTGTT TCACTAAACT TATTGTATGC ATAAGCTGCT ATTAAACCTA 240 

TTACAATACC AACAAAGACA TTGCCATTAT TCATCTTTTC AAAAGCTGAA TTTATTTCCG 300 

ArGCTTTCAT TCCTAATAAA GGCGCTAATT TCATTGGTGA TAATACAACT GTAACTAAAA 360 

AATATCCTAA CGTrGCTGCA rGCGsGACTG CACCATCATT TTTCTTTGCC ATTCCTATAG 420 

CTACACCAAT TGCAAATAAA ATACCTAATT GCTCTAAAAT CGTAGTACCT ACCGTAGTAA 480 

AGAACATTGC GATTTTCGGC GTCGCATGAA GTGCATTTAA CGTATTACCA ATTCCGGCAA 540 

TAATTGCTGC AGCCGGTAAA ATGGCAACTG GTAACATTAA CGAACGCCCT AAATTTTGGA 600 

AAAATTTATA CATTGAATGT CATCCTTCTT AAAATAATGT AGAAATATAA AGATTACTAA 660 

TGTAACTAGA ATAACTACTT CGATACTCCG TTATAGTCAC CTAGGCTTAC TAACCAGCTA 720 

TATTTCTACC TCAAGTTATT TTATAAACTT TTTACAATTT CATGCAATTC TTGTTGTAAC 780 

TTTGCTGTTC GTGTTTCAAT CTCTTTTGTA ATATAATCGA TACGCTCGTT TCGTTTTAAA 840 
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AAAGACCGTG AATCTTAGTA GGACCAACAT AAGCAACAGG TAATATTGGT GACTTACTTA 960 

ACATTGCAAT TGTTGAAGCA CCaCGTTTCA AAGGTGCACC TTCTTGCGAT GTGCGAGAAC 1020 

CTGTTGGGAA GATACCAACT GTCTTATTAT CTTTCAACAA ATTGATTGGG CGTTTTAAAG 1080 

TACTAGGTCC TGGATTTTCA CGATCTACAG GAAATGCATT TAAAGACGTT AAAAATTTAC 1140 

CAATCCATTT ATTTTTGAAT AATTCTTTTT TAGCCATATA ATGAATTTGA TTAGGATATA 1200 

ATGCCATACC TAGCATAATG ACTTCGTTAT AACTTTCATG CGTACAAGTT ACGACATATT 1260 

TACTATCCTT AGGAATATTA TCTTTACCGA TTACGTATAA TGATTTTGAC ATTTTAACTA 1320 

AAATGAAATT CAAAATCTTA CTAATCACTG AATACATTGT GCCACCTACT TAACTT 1376 

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7363 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

TTGTCATACC AATATTTTGT AAAATATGGA ACACAAGTAA AGTGACGAAA CCAACGATAA 60 

AGATTTTGTT AAATTGATCT TCAATTTTCG CAGCTAATCT TATTAGATGG AAGATTAAAA 120 

ATAAAAATAT TAAGATCAAT ATGACAGAAC CGATAAAGCC AAGTTCCTCT CCAATCACTG 180 

AAAAGATAAA GTCAGTATGA TTTTCAGGTA TATAAACTTC ACCGTGATTG TATCCTTTAC 240 

CTAGTAACTG TCCAGAACCG ATAGCTTTAA GTGATTCAGT TAAATGaTAG CCATCACCAC 300 

TACTATATGT ATAGGGGTCA AGCCATGAAT TGATTCGTCC CATTTGATAC AGTTGGaCAC 360 

CTAATAAATT TTCAATTAAT GCGGGTGCAT ATAGaATACC TAAAATGACT GTCATTGCAC 420 

CAACaATACC TGTAATAAAG ATAGGTGCTA AGATACGCCA TGTTATACCA CTTACTAACA 480 

TCACACCTGC AATAATAGCA GCTAATACTA ATGTAGTTCC TAGGTCATTT TGCAGTAATA 540 

TTAAAATACT TGGTACTAAC GAGACACCAA TAATTTTGAA AAATAATAAC AAATCACTTT 600 

GGAATGATTT ATTGAATGTG AATTGATTAT GTCTAGAAAC GACACGCGCT AATGCTAAAA 660 

TTAAAATAAT TTTCATGAAT TCAGATGGCT GAATACTGAT AGGGCCAAAC GTGTACCAAC 720 

TTTTGGCACC ATTGATAATA GGTGTAATAG GTGACTCAGG AATAACGAGC AAGCCTATTA 780 

ATAATAGACA GATTAAGAAA TACAATAAAT ATGTATAATG TTTAATCTTT TTAGGTGAAA 840 

TAAACATGAT GATACCTGCA AAAATTGCAC CTAAAATGTA ATAAAAAATT TGTCTGATAC 900 

TTGCTAAAAC AGCTATAGTG GCTACTAATA CCCAGTCTAC TTTGCGAAnC aATGCTTATC 1020 

CGGCTGTTGA CGAGATGAAT AATTCATTGC AAACTCCTTT TATACTCACT AATGTTTATA 1080 

TCAATTTTAC ATGACTTTTT AAAAATTAGC TAGAATATCA CAGTGATATC AGCTATAGAT 1140 

TTCAATTTGA ATTAGGAATA AAATAGAAGG GAATATTGTT CTGATTATAA ATGAATCAAC 1200 

ATAGATACAG ACACATAAGT CCTCGTTTTT AAAATGCAAA ATAGCATTAA AATGTGATAC 1260 

TATTAAGATT CAAAGATGCG AATAAATCAA TTAACAATAG GACyAAATCA ATATTAATTT 1320 

ATATTAAGGT AGCAAACCCT GATATATCAT TGGAGGAAAA CGAAATGACA AAAGAAAATA 1380 

TTTGTATCGT TTTTGGAGGG AAAAGTGCAG AACACGAAGT ATCGATTCTG ACAGCACAAA 1440 

ATGTATTAAA TGCAATAGAT AAAGACAAAT ATCATGTTGA TATCATTTAT ATTACCAATG 1500 

ATGGTGATTG GAGAAAGCAA AATAATATTA CAGCTGAAAT TAAATCTACT GATGAGCTTC 1560 

ATTTAGAAAA TGGAGAGGCG CTTGAGATTT CACAGCTATT GAAAGAAAGT AGTTCAGGAC 1620 

AACCATACGA TGCAGTATTC CCATTATTAC ATGGTCCTAA TGGTGAAGAT GGCACGATTC 1680 

AAGGGCTTTT TGAAGTTTTG GATGTACCAT ATGTAGGAAA TGGTGTATTG TCAGCTGCAA 1740 

GTTCTATGGA CAAACTTGTA ATGAAACAAT TATTTGAACA TCGAGGGTTA CCACAGTTAC 1800 

CTTATATTAG TTTCTTACGT TCTGAATATG AAAAATATGA ACATAACATT TTAAAATTAG 1860 

TAAATGATAA ATTAAATTAC CCAGTCTTTG TTAAACCTGC TAACTTAGGG TCAAGTGTAG 1920 

GTATCAGTAA ATGTAATAAT GAAGCGGAAC TTAAAGAAGG TATTAAAGAA GCATTCCAAT 1980 

TTGACCGTAA GCTTGTTATA GAACAAGGCG TTAACGCACG TGAAATTGAA GTAGCAGTTT 2040 

TAGGAAATGA CTATCCTGAA GCGACATGGC CAGGTGAAGT CGTAAAAGAT GTCGCGTTTT 2100 

ACGATTACAA ATCAAAATAT AAAGATGGTA AGGTTCAATT ACAAATTCCA GCTGACTTAG 2160 

ACGAAGATGT TCAATTAACG CTTAGAAATA TGGCATTAGA GGCATTCAAA GCGACAGATT 2220 

GTTCTGGTTT AGTCCGTGCT GATTTCTTTG TAACAGAAGA CAACCAAATA TATATTAATG 2280 

AAACAAATGC AATGCCTGGA TTTACGGCTT TCAGTATGTA TCCAAAGTTA TGGGAAAATA 2340 

TGGGCTTATC TTATCCAGAA TTGATTACAA AACTTATCGA GCTTGCTAAA GAACGTCACC 2400 

AGGATAAACA GAAAAATAAA TACAAAATTG ACTAACTGAG GTTGTTATTA TGATTAATGT 2460 

TACATTAAAG CAAATTCAAT CATGGATTCC TTGTGAAATT GAAGATCAAT TTTTAAATCA 2520 

AGAGATAAAT GGAGTCACAA TTGATTCACG AGCAATTTCT AAAAATATGT TATTTATACC 2580 

ATTTAAAGGT GAAAATGTTG ACGGTCATCG CTTTGTCTCT AAAGGATTAC AAGATGGTGC 2640 

TGGGGCTGCT TTTTATCAAA GAGGGACACC TATAGATGAA AATGTAAGCG GGCCTATTAT 2700 

AAACCCTAAA GTAATTGCCG TCACAGGGTC TAATGGTAAA ACAACGACTA AAGATATGAT 2820 

TGAAAGTGTA TTGCATACCG AATTTAAAGT TAAGAAAACG CAAGGTAATT ACAATAATGA 2880 

AATTGGTTTA CCTTTAACTA TTTTGGAATT AGATAATGAT ACTGAAATAT CAATATTGGA 2940 

GATGGGGATG TCAGGTTTCC ATGAAATTGA ATTTCTGTCA AACCTCGCTC AACCAGATAT 3000 

TGCAGTTATA ACTAATATTG GTGAGTCACA TATGCAAGAT TTAGGTTCGC GCGAGGGGAT 3060 

TGCTAAAGCT AAATCTGAAA TTACAATAGG TCTAAAAGAT AATGGTACGT TTATATATGA 3120 

TGGCGATGAA CCATTATTGA AACCACATGT TAAAGAAGTT GAAAATGCAA AATGTATTAG 3180 

TATTGGTGTT GCTACTGATA ATGCATTAGT TTGTTCTGTT GATGATAGAG ATACTACAGG 3240 

TATTTCATTT ACGATTAA7A ATAAAGAACA TTACGATCTG CCAATATTAG GAAAGCATAA 3300 

TATGAAAAAT GCGACGATTG CCATTGCGGT TGGTCATGAA TTAGGTTTGA CATATAACAC 3360 

AATCTATCAA AATTTAAAAA ATGTCAGCTT AACTGGTATG CGTATGGAAC AACATACATT 3420 

AGAAAATGAT ATTACTGTGA TAAATGATGC CTATAATGCA AGTCCTACAA GTATGAGAGC 3480 

AGCTATTGAT ACACTGAGTA CTTTGACAGG GCGTCGCATT CTAATTTTAG GAGATGTTTT 3540 

AGAATTAGGT GAAAATAGCA AAGAAATGCA TATCGGTGTA GGTAATTATT TAGAAGAAAA 3600 

GCATATAGAT GTGTTGTATA CGTTTGGTAA TGAAGCGAAG TATATTTATG ATTCGGGCCA 3660 

GCAACATGTC GAAAAAGCAC AACACTTCAA TTCTAAAGAC GATATGATAG AAGTTTTAAT 3720 

AAACGATTTA AAAGCGCATG ACCGTGTATT AGTTAAAGGA TCACGTGGTA TGAAATTAGA 3780 

AGAAGTGGTA AATGCTTTAA TTTCATAGAG ATTAGTCGAG GGACCTTTTA CTTATAAAAA 3840 

TGATTTGAAT TAATACTAAA AGATTACAAA GAAGAGGTGG TTTTGTGTGT AAATACAAAA 3900 

TTGCCTTTTT CTTTTTATGT TAAATCTATA AATTTGAAAC TAAATCAAGG TTAATTCTAT 3960 

GTACACACTT TATATAGGAA GTAGTTTGAA TGTTTATATA ATGTTTTACA AAAAGATGTA 4020 

GTATTATAAT GTCTAATTTC ACATGTGTTT CAGTAAAATT TGTTGTGGAA TGTTAACGAT 4080 

ATACGTATTT TATAAAAaAT TTTTTATAAT GATTATTCGA ATGATGCGTA ACGCTTACAT 4140 

CTTATCTAAT GCTAGCTTTT TGACAAAAAT ATGACAATCA ATTAATGTGA TTCTAATAAA 4200 

TATTCGCAAA TTGCTTTATT GCGATTAAAT TTTTTTGGTG GTACTATATA GAAGTTGATG 4260 

AAATATTAAT GAACTTATAT GCAAAAGTAT ATTGAGAAAT AAACAGGTAA AAAGGAGAAT 4320 

TATTTTGCAA AATTTTAAAG AACTAGGGAT TTCGGATAAT ACGGTTCAGT CACTTGAATC 4380 

AATGGGATTT AAAGAGCCGA CACCTATCCA AAAAGACAGT ATCCCTTATG CGTTACAAGG 4440 

AATTGATATC CTTGGGCAAG CTCAAACCGG TACAGGTAAA ACAGGAGCAT TCGGTATTCC 4500 

AGAATTGGCA ATGCAGGTAG CTGAACAATT AAGAGAATTT AGCCGTGGAC AAGGTGTCCA 4620 

AGTTGTTACT GTATTCGGTG GTATGCCTAT CGAACGCCAA ATTAAAGCCT TGAAAAAAGG 4680 

CCCACAAATC GTAGTCGGAA CACCTGGGCG TGTTATCGAC CATTTAAATC GTCGCACATT 4740 

AAAAACGGAC GGAATTCATA CTTTGATTTT AGATGAAGCT GATGAAATGA TGAATATGGG 4800 

ATTCATCGAT GATATGAGAT TTATTATGGA TAAAATTCCA GCAGTACAAC GTCAAACAAT 4860 

GTTGTTCTCA GCTACAATGC CTAAAGCAAT CCAAGCTTTA GTACAACAAT TTATGAAATC 4920 

ACCAAAAATC ATTAAGACAA TGAATAATGA AATGTCTGAT CCACAAATCG AAGAATTCTA 4980 

TACAATTGTT AAAGAATTAG AGAAATTTGA TACATTTACA AATTTCCTAG ATGTTCATCA 5040 

ACCTGAATTA GCAATCGTAT TCGGACGTAC AAAACGTCGT GTTGATGAAT TAACAAGTGC 5100 

TTTGATTTCT AAAGGATATA AAGCTGAAGG TTTACATGGT GATATTACAC AAGCGAAACg 5160 

TTtAGAAGTA TTanAGAAAT TTAAAAATGA CCAAATTAAT ATTTTAGTCG CTACTGATGT 5220 

AGCAGCaAGA GGACTAGATA TTTCTGGTGT GAGTCATGTT TATAACTTTG ATATACCTCA 5280 

AGATACTGAA AGCTATACAC ACCGTATTGG TCGTACGGGT CGTGCTGGTA AAGAAGGTAT 5340 

CGCTGTAACG TTTGTTAATC CAATCGAAAT GGATTATATC AGACAAATTG AAGATGCAAA 5400 

CGGTAGAAAA ATGAGTGCAy TcGTCCACCA CA7CGTAAAG AAGTACTTCA AGCACGTGAA 5460 

GATGACATCA AAGAAAAAGT TGAAAACTGG ATGTCTAAAG AGTCAGAATC ACGCTTGAAA 5520 

CGCATTTCTA CAGAGTTGTT AAATGAATAT AACGATGTTG ATTTAGTTGC TGCACTTTTA 5580 

CAAGAGTTAG TAGAAGCAAA CGATGAAGTT GAAGTTCAAT TAACTTTTGA AAAACCATTA 5640 

TCTCGCAAAG GCCGTAACGG TAAACCAAGT GGTTCTCGTA ACAGAAATAG TAAGCGTGGT 5700 

AATCCTAAAT TTGACAGTAA GAGTAAACGT TCAAAAGGAT ACTCAAGTAA GAAGAAAAGT 5760 

ACAAAAAAAT TCGACCGTAA AGAGAAGAGC AGCGGTGGAA GCAGACCTAT GAAAGGTCGC 5820 

ACATTTGCTG ACCATCAAAA ATAATTTATA GATTAAGAGC TTAAAGATGT AATGTCTTGA 5880 

GCTCTTTTTT GTTTTCAATA ATTGATTCTC TGTAGATATC aAAGTaCTAA CGTTTTAAAG 5940 

GTTAAATATT TAATTGGATT GAGATCTGTA TGCGGTTATA TCaTTCTGTG TAAATATGGT 6000 

TCTCCACCAA ATGTGGTGAG TATATAATTT AAAGAACTAT TTTTAAATTA AGAATAATCG 6060 

AACATAAATA AACTTTATGA AATTTCAGTA TCATGTTCTT ATAAAAAACA ATAGGGCTTT 6120 

TTGctGACGC TAGTGCGCGA TAAATAATAA GTTGAATATA AAAAAGATCA CTGCCAATCA 6180 

TTCGTTTAAT GGCAGCGATC TTTTTTATTT AATTATTTCT CTTTCCACTG CAACATTTGA 6240 

TAACCAATGC GTGGATGTGT TTTAATAATA TCTTTTGCGT CCTCATGACA TTGTGAAAGT 6300 
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CCATATATTC GTTTTAATAT CATCTCATAA GTGAGTACTT TTCCTTTATG ATTTGACAAT 

AGTTCTAACA AGCTAAATTC ATTTGGCGTC AAATGTACCT CCTGATTATT AATAACAACA 

GATTTGGAGC CAAAGTCGAT GCTTAGCAAA CCGTTAGTAA ATACAATGTT AGTTTCTTGA 

TGTGACTTAG CGATTCTCTC GATGACTCGT ATTCGTGCCC GAAGCTCATC AACATTAAAA 

GGTTTAGTCA TATAGTCATT CGCACCGTTA TCTAAAGCTT GAATAATTGT TTGTTCTTCT 

TGTCTTGCAC TTATTACAAT GATAGGAATG TCAGTATGTT GCCTGATTTC TGAAATCAAA 

CATAATCCAT CTTTATCTGG TAAACCTAAA TCTAATAAAA TGACATCTGG TTTATCAATT 

TGAATTTTAA AGTGTGCTTG TGTGGCATTG TCGGCTGTAG TTACATTGTA ATAATCTAAA 

GTTAATGCAA CATCAAGTAA ATGTGTGATT GCGTGATCAT CTTCAATTAT CAATATTTTA 

GATTGCATTA TACGTCTCCT TCGTTAAAGT CTGTATATAT ATTGAAATAG AATATACTGC 

CGTGTGGTTG GTTCGGTTTA TATTGTAAGT TTGATTGATG TTTGTGTAGG ATAGTCTGTA 

CTAAATATAA GCCTAGTCCC ATGCTTTCTT TTTGGTTATC TTTAAAATAT TTATTTGATC 

CTGTGTAAAA AGGCTCGAAT ATCTTTTGTt GTTCTTCTAA ACTAATTCCA GGTCCTTCGT 

CTATAACGGC AAATTCGATT TGTTCATAGC TAGCATAACG AATAGATAAA TTGATTTTGG 

TGTCAGTAGA AGTGTGTTTA ACTGCATTTT CAATCAAATT GAAtAAAgCT TGTAAAATCA 

ACTTACTGTC AATGTGTATA AACCGTAAAT TTACTGAGGA TGATACAGTT ATACGCTTTT 

TTAAATGGCG ACGTTCTAAA ATACATATCG ATTTCTTATA CTA 

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10470 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 
TTAACAATCG ATAACCACAA TACTTCTATT GTAATTGTTT AACGATTTCn CGATTAAAAT 

CATCTAAATC GTCTGGTACT CGACTTGTTA CAATATTGTT GTCTACAcTa CTGACTCATC 

AACTACATGT GCGCCTGCAT TTGATAAATC TTTGCGTACA TTTAATACTG CTGTTAACGT 

ACGACCTTTT AAATCGTCTG TATCTATTAG TATTTGTGGC CCATGACAAA TGGCAAATGT 

TGGTACATCA TTTTTAGTAA AGTATTTAGC AAATGTGCCA TATCGACCTT CTGTATCTCC 

ACGTAAATGA TCTGGTGAAA ATCCTCCAGG AATTAATAAT GCATCATAAT CTTCTGGTTT 

ATTTGCAGTA TCTCCAATCA CTACAGTATT AAAGCCTGCA TTCTCTAATG CCTCTTTAGG 480 

GCTTGAATAT TCTATATCTT CAAATTCGTT TGCTAGAATA ATTGCTACTT TTTTAGTCAT 540 

TGAAAATCAC CTTTCTATAT ATCATTGATA TAATTACTAT AGACAAGTAA ATCAGTGATT 600 

AAACATACAA GATATAAAAA ATATTAAGCG ACTGTCGCGA TATCTAACCC TAACACATCT 660 

TATGTGGCAT TTACTTAGAT ACTAATTTAA CCTTTTCTTC AAGCTGATCT AACAATCCAA 720 

TCCATTCATC TATATCTTCA ACACGTACTT CATCAGGATT TACATGATCG ATATCCTCAA 780 

TAAACTTATT TAAACGCGCT TTTATCTGTT CGATTGTTTG CTGTTCATTC ATAAAAAGTT 840 

AACTCCTTTT ATTTTGTTTT CTTTTTCATT ATTATCCTAA CAGAAATTGC GTTAAAGCGA 900 

TATAATCTTA GCTATATTTA TGACATTCAA ATTATTTTGA CTTTTAAAAA TCCCCTTTTC 960 

AATTAACTAA AATTAAGAGA TAATTTGTTA CGAGTGATAA TACGAaGkGG TaTCATACCG 1020 

ATATGAACCA AATAGAAAGA AGGAAGTTTA AGACGATGAA TAGCGTCAAA TTGAAGCAAC 1080 

CTGTTAGCAT TTACAATGAT CCATGGGAAG TGAAATTTAT ATACATTTAA ATTTCATGAG 1140 

ACAATAAACG TTGATTTAAT GCGTTTTTTT GCCTTTTTTA TTTTCCTTAT TTTTTCTGTT 1200 

TTACAACAAA ATGGTATCAA AAATGGTATC ATTTGTAGTT ATTTTAGCTT CACATATTAA 1260 

AACAACCACA CTCCTAAATT AATAGGTGGT GTGGTTTTGT TGGTTGTGTG GGGATAAAAA 1320 

TAACCGCATC AGTTAAGATG CGGTTATCTA GCAAGGGCCA CGTATTTATA AATACGTTTA 1380 

GAATCTCTTC GGCAACTTTG CTATAGACAG TCTATGCTGT TACTAAATTA TACCACCACA 1440 

CAAACCTACT CCCATTCAGG AACACAGAGC TTTGTCGCTC GTCAGCAACG TCATATGAAT 1500 

TCTCAGTTCA TGTTGTGGTG ACACTTTAAA CGGTCTGTGC CAGTAGCGAC CGAGTCATTT 1560 

CAAGAATGAC CATTTCACAT TTATATTATA ACACTTGTCG TGCGTAACTG TATAGTTTTT 1620 

CAGTTGTATT TAAAGTTAAG TTATCTACTT CGCGCTTTCC TTGCCTTAAT TGTGAAATTA 1680 

CATATTGCGC 7ACGCCAGTT TGTTTGTGAA TTTGGTAACC TGTTATATCA CTTTTGATCA 1740 

ATTCAATTAT TTTTAATTTA TAATCACTCA TATTATCTAC GTCCATTCTT TTTATCTAAA 1800 

CAATAAAAAT GTGTCTTTCT CCCGATAAAT AATAACAATG GTAGGCTTAA TAAAAACAAT 1860 

ATTAAATACA TTTGTTCTGT CATAATTGAA AACCTCCAAA TAATATTATA TTATATAAGT 1920 

GTAAGGAGGA GCCATCAGGC TCCAAGCATA ATGTTAATCT TTGTTGTTTG GCTTTCGGTC 1980 

TAGGTAGCCG AGATGCCaTT CTCTAAGTTG TTTTAACACT TCTGGAATTA TCAGTACTGC 2040 

CAATACTTGA TGTTCTAGAA GTGTTTTTAT TATGTCTAGC ATGAGGCTTT TCACCTCCTT 2100 

ACACATAATT TGTAAGTCAT CAACTAACCT ACAAATATAA TTATACTAAA CAAATGTTTA 2160 



EP0 786 519 A2 



GTTATCTACA TTTAAATCTT GAGAGAAATG TTAAAAAGTT CTAGTAAAAT AATAGCACAT 2280 

TTTATCTTTA AATGTAAATA GAAAGCAGGT ATGTAACGCA CCTGCTTAAA TAGaCATGAC 2340 

TATGTCATTC TAACTGATTT CTCCCCATAA GTCACCTAAT ATCTGATTAG GTGGGGCAGA 2400 

ACCATTCCAT GTTCTAATAG GCAAGTAATA ACGTTGCCCC TCCCATGTAT ATCCTACCCA 2460 

AACATGACCA TCTTGTAACA TCACTTCTGT ATAATCACAA TACCCACCAG GTTGGAACTG 2520 

ATAACCCACT GGACAAGATA AGAATGGCCC CACTTTTCTT ACTGTGATTG GTTGATTGCC 2580 

GTTTGTGAAT CTAGCACTTT CTTCCATGTA GTAAGTACCA TATTTATTAC GTTTCCATGC 2640 

ACTTGCAACT GGTTTAACTG TATTACTTGA AGCGCTTGAC TCATTAGAGA CAGTGGCAAC 2700 

CGGTATTTTA CCATCCATGT ACGCCCTAAT CTGCTTGATA AAGTAGTCTT TAAGTTGCAA 2760 

CCGCTTGTCT TCTGGCAATA GACCGCGAGT TACTGGGTCA AAACCAGTGT GTAAAACCGA 2820 

ACTTCTATGA GGGCATGATG TTGAAGTAAA TTCATTGTGC AATCTGATTG TATTTCTGTT 2880 

TGCTGGTAAT CCCCATTTTT TCAACAATCT AGCGCATTCT TGGAAAGTTG CCTGTTCATT 2940 

TTTTAAGAAT GTCGCGTTAT CTGCGCCCAT TGATTGACAT ACTTCAATAC CGTAATAATA 3000 

TTTATTACCT ATTTGATTAG CGGTATGCCA ACCTACTTGT GATTCATCTA AGGCTTGCCA 3060 

AACTGTGTTG CCTGATACGT AACTATGCGC AATGCCCGCT TCTAATCTTG ATAAAGGTGC 3120 

ATTTACTAAT CCGTTACGAT ATGCTTCAGC AGTCGCCCCT TTGCTCCCTG CGTCGTTGTG 3180 

TATAACTATA CCTTTAGGGT TACTACCACG CTTAGGTAGG TCATAACCTT TAACCACATC 3240 

TTTGATGATT TTAAGTTCTA CTGCTTTAGG TTGTGGCTTA GCTGTTTCTT TTTTAGGTGC 3300 

TTGTGTAGGA GATTGAACTG ATCGTGGCGC TGTCTCACTT TTAAAATTCG GACGGATAAA 3360 

CCACATAGGG AAATCA7AAG CATGTTGTCG TCTTGTAACT TTTTCCCAAC CCCAGCCGGG 3420 

TTGTTCGATT CCGTCAGTCC AGCCACCGCC TAGCCAATTC TGCTCATATA CAATGATGTA 3480 

ATCTAAAGTT GCTTCAATTA CCCATGCAAC GTGACCATAT CCAGCACCGT AGTTGCTACC 3540 

GAATACCACC ATGTCGCCAG GTTGTGCTAA GAAGTCCGGT GTATTTTGGT ATACAGTAGC 3600 

TAATCCGTCG AAGTTGTTAG CGAACGGAAT ATCTTTTGCA CCTAAACCTT TTAGAAGTAA 3660 

TCCAAACAAA ACTTTCCAAC CAGCATTGGC ATAATCAAAG CATTGAAATC CATACCATAA 3720 

GTCCACATTG AATTGTTTTC CCTCAGAAGT TTTCAACCAC TCTATAAACT CATTTTTAGT 3780 

TAATTTTGCT TGCATTGTCG CCACCTCCAT GATGATACTC ATTCACATCA AAGCCAACAT 3840 

CGTTAGAGGC GTCTGTGAAA GGTTGTGATG TATCATATTC TTTTGGTGcT TTCGCGCTTA 3900 

ATTCCGGCGT TAAACTACTG TCTTGTGATG ATTTCCACGT AACTTGTTGT TCTTCTTTTT 3960 
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TTGGGTCAGT AATAACGCCA ATACCTGTAA GTAACGTGAG GATAGCGCCT ATAATTGCGC 4080 

TAGCTTGATT TAATTGAGTA GATAAATCTA ATCCGAATAA ATCCGTGACT TGCTTGATAA 4140 

ATAGCAACAA TGCTCCAACT AAACCAGTTA GTACTGCTTT GTTTTTGAAT CTCAATTTCC 4200 

AGTTAATATC CATTTGTTTG CTCCTTTTAT CCAAAATAAA AAAACGACTA AAAATTAGTC 4260 

GTTTAAAATT ATTCAATGGT CAATGTCGGA GATCCTGAAT AAACATCACT TATAGTGACG 4320 

TACAACATCC CTGAAGGATT ACTAAAGTTG ATATTTTTAC TTGCAACTCC GCTATTGACT 4380 

CCTGATATTC CTAAATCACT TGACCCTAAA TTAGTTTGCG AAATCCTCAT TATACCGCTA 4440 

CGTACATTTT CTATTGTCAC CTGATAACTT TTATTGGGTT CAACTCCATT TATTGTCCAT 4500 

TTTGCTGTTG ATTCTTCTAT GCTATCCGGA TATTTATTTT TAGGTAAGGG TTTTATTACA 4560 

AAAGATGAAG GCTTTTTCCA TACTTGGATA TTTCCAGCAT ATACTTTTGT ATATTCTTCA 4620 

CCTTCGTAAA TAAACTTCTT TACATTTTTA AAATTACCTT CCATAAAAAT CACCCTTTAA 4680 

TTAAATATAA CGTATTCGGG TCTTTTTGAT ATATATAGTT ATATTCATTT TCTGTTCCTG 4740 

TCCAAATTTT AACCGTCGGT TGAGATGCGC TTTTTAGTTG ATATAAATTA TCCGCTTGTT 4800 

GTTTAGTAAA AGCTTGAGAT GACAAAACAT ACCGCTCGTC ATGATTATGA TTTTTTGGAG 4860 

CATATAAATC ATTTAGTGTT TGTTTGAATT CCTCAAAATC TTCTGTATTA ACTTTTGAGC 4920 

CAATCTGTTG CAATACACTT TCTGAAATAG AGTTGTTTTG TATTGCTTCT GCTAATTCTC 4980 

TTAATGTGTT CATAGATTCA GGCGCGCTAT CAACTAGTTC AGCAATTTTT GTATCCGTAT 5040 

ACGTTTTAGA GTCGTTGAGA GTTGTATCTT TGATTTTTTC AACTTCTTGC AATTTATTTT 5100 

CTAACCCTTC AACATTTGCG ATATTGATTT TGTCCAATAA CTCAGGTTCT GCTTTGATAT 5160 

CTGTATCTTT ACCATCAATT TGCCACATTT TAGTGTCAGG ATTGATTGAT ACTACAGTAC 5220 

CGTTTTTACC GGGTGCGCCT TGTTCTCCTT TTTTACCTGC TTCACCTTTT GCTCCAGGTT 5280 

GTCCCGGTTC ACCTTTATCA CCTTTCGCAC CTTTAAATCT ACTTTCATTC TTTTCGATGT 5340 

AAGAAATGAC ATCTTTATCT ATTTTCTCTT TAAAGTCTTT GCTCAATAAA TCTGTCGCGT 5400 

TATCTTTTAA AATTCTCGTA ATAGCATCAT CTACCAATTT AACATCGATT TCTTTTGCTA 5460 

CAGCAGATTC AATACCACTA TCAACGATAT TGAAAGAAAA GTTTGCGACA TGTATTTTTT 5520 

CTTCTTCTTT CTCTAAAAAC AGCTTACAGC GAACATAACC AGCGTGTTTG ATAACCTTTT 5580 

TAGGTATCTT GTAGGTAAGG AAACCTTTTA CAACATCGTC GATAATAAGG GGCTCATTTT 5640 

TGAATATAGA GCCATCTTCC ATAAACAAAT GTAATCTAGG TGTTAAGCCA TGTGCTTTTA 5700 

GATCGATACG ACCTTGTTTG TCATTGATAC CTATTCTTAT AGATGCTGTA TTTTCATCTT 5760 
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CAACATCTTT TATTTTGTAC ATTTACACAC CTCTTTATTT ATATTTATCC CTTGTGAAGT 5880 

AGATACCTTT TAAGCCGATT TGTTTATATA ACTTAGCGAT TGTACTTGCT TGATGTTGGC 5940 

ACCACTCTAT AGCAGTAGCG TATTGGTGGG TAGCTGGATT CTTAGGATTC CATCTAATTC 6000 

GGTACAATGT GTTTTGACCT TTATTGATGT AATCCTTTCT TACGAAGCTA GCACCGCCCA 6060 

TGATTGCTTT TGCTGGAGAT GTCCAACCTT TATTCCTTGC AAACGTCATT GCGTAGTTAG 6120 

GATTGTTGTC GTAAGCGCCA ATGCCGAAGT AGTTGTATAC TCCATCTTTT CCGTTAGCGA 6180 

AGTTACTTGT TCCATATCCA CTTTCTAAGA AAGCATGCGC GATTAAATAA ATTTCATTAA 6240 

TGTTGTGCTT TTTACAAGCT TCTGCGAACG CTTTACCTTG ATTATTCAAT GTTCCCTTAC 6300 

CTTTAAGTAT CTTATTAAGT GCGCTAACTG AAACACCTTG ATACTTGCCT AAATTAAGCA 6360 

TTTGGTAGCA TTGTGTGTTA CTTTCCCATA TACGCTTTAC ATTCATTGCT GAACTCGTTT 6420 

GTGCTCGTGT AGCGTTAscC AACCCCAAGC ATTAGATTTT TTCGGGTTAC CTCTTGCCAT 6480 

TTGTTTATCC AGTGCTTGTT TGAATGTATA AGGACTCGTT TCTGTTATGA TCTGCGGTTG 6540 

TTTAGATGCC GAACCATTGT TGGCTGTTGG TGACGAGTCT CTTACATTAG CTATATCAGC 6600 

GTTTTTATTA TCTACCATAA CTTTTATTCT AGATTTTGTT ACTGTTGGCT TAGTTATAGA 6660 

ATTTAATAAT TTTTCTCTGT TTTTAAATAT ATTAAGTAAT GCCTTTTCTA ATGCTTCGTA 6720 

TTTATCTTTA GGAGGAACAC CGTTGTCAAT CATATTCCAA TTAACATGTT CCAACATTGA 6780 

ACGCCAAATG CTGTCG7CTA CTTTTAAATT TTCAATACTT AGAGGTATCT CATATTTGGC 6840 

CATCATATCT ACAGCTACAA CCATTGCGTG AATCTCATTA AAAATAAATT CATTTTTACT 6900 

CGCACTATAA TCTTCACATA CGTCTATAAC TATATAATCA GGTTCATTAG GAACTTCAAA 6960 

TACAGCTCTT CTAGGTGCCC AAATATTATG TCTATCAACA TAAAAGTGGG GATATTCTAC 7020 

ATCCTGTTTG TATTTCTTCC TACTGTTATA TAAACTTTCT ACCGAGCTCA TCGTTTGTGC 7080 

GTTTCTAATC ATTATTCCTT TAGGTTTTTC GAGTCGTCGA TTACCTTCTA CTATAAAGTG 7140 

ATAAATATAT TCTGGATAAT TAACCTCTTG GCTAGAAATA GTGTACTTTA TAGTTGTTAC 7200 

ATCTTTCCAA ATTGGAACTT TTTTATTATT TTTTTCGTTA TCATCACTAT CATCTTCTGG 7260 

TTTAGGTGCC GGTGTAGTTT TGTCTGGATG ATATGGTGGT CTAACAAAAT ATTTAACCCC 7320 

TCCACCTGGT CCATCATGAT AAGAGTGTTT AATTTTATAA GGTGGACTTC CTGTTGCGTT 7380 

ATTTGTATAC CAGTTTTGAT CTACGCCATA CCAATAGTCT TTTGTGCATG GTCCCACTAC 7440 

AATGTTTACA TGTCCTGCCC AACCACCAGT CCAAACACCC CAGTCGCCTG GTTGTGGTAC 7500 

AAAATCTTTT GTATTTCTAA TTATCTTGAA ATCTCTACCT CTATAATTGG ATTTTTGAGC 7560 
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TAAATCCCAG CATTGTGCTC CCATTCCAGA ACCAGGTACA TCAATAGCTA TTTTGTTTTT 7680 

AGCGATATAT AACGCCCATT CAACCACTTC ACTAGCTGTG GGCTTTCTAT TTTTCGGATT 7740 

AGGTAATCCC ATGTATGCAC CTCATTTCAA TCAAAATAAA AAGCCAGTGC CGAAGCACTG 7800 

ACTCTTAACT GTTATTTACA TTTACCAAAC CAGAAGCACG CCCAGAAGCT ATATCCTAAA 7860 

ATCCCTTTAA GCATGGTAAT CACCTCCTTT AAATACCAAA AACAGTTCTT AGTAAAGCTA 7920 

TGACAATCGT ACTGAAGATA GTCCCTATCA AACCTAGAAT CCACATTTTT ATGTCTCTAA 7980 

TATTCTTGGC ATTCTTTTCT TTATTCTTTT CATCTTCTAC CTTGTCGCGC TTTAATTCTT 8040 

CAAAATTTCT ATCTAATTTG TCATAAATCT TTTCTTGCGC TCTAAGAC7A TCTTCTATTC 8100 

TGTCGAATTT TTCAAACATA GTCTTATCAT TTTCTTCTAA TCGCGTTAAA CGCCAATCTT 8160 

GTTCATGTCG TTTGGTAAAT CCAAACATTA TGCCACCCAC TTTATTCAAA TTAAAAAGCC 8220 

ACAAGCATTA CACCTGTGAC TTTTCATCTT TTGTTTCTGG ATATTTTTCT CCAGTGATTA 8280 

AAGCGTATTC TTCTTTATCG ATTAAACCCT TGTCTACGTA CCACTTAATT TGCTCGTTTT 8340 

TATAGTAACC CCAAACATAA AAAGTTTTAA TGTCTTTAAA AGTTGGATAA ATCATCTTCA 8400 

TTATTTAAAC GTCCCCCTCA GTACTTGTTT TGTTAGTTTT CAGTTCAGTC AACTGTTGTG 8460 

TTAACATAGC GTTTTGTTGA GCTAATTCCA TTGTTAATAC GTTTACTTGT GCCACCTGCA 8520 

TTTGCATACT CGCAACCATT CCGCGAAGTT CCTCATCACT TAAATCTGAC GCACTTTGTT 8580 

GGTTTGATGC ATTCGGTACG TCTTCTTTTT CGAAATTGCT ATTGTATTTA ATTTCGCCGT 8640 

TAGTGAAAAC AAACTTTCTA GGTTCGAACT CTTCTTTAAA TTTAATAGGC ACATTGTTAT 8700 

CATCTACATC TAAACTATTG CGTAAACCGC CAGTATTAAC GAATCCGATA ACTTCGTTTT 8760 

TATCGTTTAC TGTGATTTTC ATTATTTCCA CCCCATAATT TTAGTTATAG TAACTTTGTT 8820 

GGCAJTCGCT CCAGAACCTG ATGTTTTACC TAAATCAAAG TACACATCGT TATCTATTCT 8880 

TAAAGTAGTG CTACTTGTTT TGGATAGTAA GCACTCATAA ATACCGCCAC CGTTGCCGTC 8940 

TGAGTCAACT ACATTCGCTT TACTCAATTG AATCGCGTTA GGTAATGCGG TTAGTCCGAA 9000 

TCCCTCAATA ACGCCACCTG GATAAGTTCC ACTTACCAAC AAAATAGAAT AGTTTGTGTA 9060 

CGGTTCAGTT AGATTGATTG TTGTACCTAC ACCATTTGCG CCACCGTCGA ACAATACCGT 9120 

TGATTTATGT TCATTAGGAA CTGTCCACTG TTGCTCAAGT CTGCCGTTTG TGATTGATCG 9180 

TGTGTAAATC TTTTTAGAGT TATAAGGTGT GAAGTTAAAT AGCTTGTTTG TATCATCTTT 9240 

AACGAATACC GATAAATAAC CCTCATAACT TTCAACGCTA CCTGGTAAAT CCGGCACTCT 9300 

TGTTGCATAG TAATTACCAG CAGTTAAATA TCCCAAATCG CCTTGCGCAT TATTTAAGTT 9360 

TCTACATACT GCTTAGCTTG ATTTAAAGCG TTGTTAGACG TTTCTTCAAC 9480 

GTTAAGTTTC CATCATTCTT 7TTA7AAAAC GGGTACCATG TGCCGTAGAT 9540 

GTGTACTCAT CGTTTGAATC GTCTGGGTAC CATGTTGCAC GAGCAGTATT 9600 

ACATAAACAA CTAACACACC AGATTTGCTT GATGTATAAG TTGATTCATC 9660 

CCGTCATCAA CACCATCTTG TCCAGGCTTC TCTAACGTGC CTATATCCGT 9720 

GCATCTGTTG CATTAGTAAT ATGAATAATC CTAGATGTGT TAACTGCGCT 9780 

TCTATGGACT GCTCATACGA TTCAATTGCT TTACCGTAAT CATCTGTAAG 9840 

TGCCAATTCG TTGTTGAATT ACCTTTAACA AGGTCAGCGC CATTGATTTG 9900 

TCGTTAACAC GTTCAAAAAT CGCTTGCTCT TTTTCAACTA TTTTATCGAA 9960 

ACAGCTTGTG TTGCACTAGT TTGCGTCGCA GTAATAGCTT GTATAGCTTC 10020 

ATTTCGATTT GTTGAATGCC TTTTGTCGCA CTATCATTCA CTTTTGCTAT 10080 

GTATCAGCCA TATTTTGCTT TAATTGGTTA AAATCTTTAC CGACAGCTTC 10140 

TGAATAGATT TGATATAAAC AAGCTTTGTT ATACCATCAA ACCCACTAAC 10200 

TCAATATTGA AGCTAAATTG ACGTTCAACA ACAACATTAT TACTCCCGTT 10260 

AATGCCTGAG CATGCACCTT GCCTGAATGT TTTAAAAATT CATTCGGTAT 10320 

AAACGCCCAT TAATTGCGTC TACTATCGTT AATTCGTCTG AAATATAAGC 10380 

ACGTTATAAT CATCGGTTTT TAAnACGATA GATGTTTTAA CATGTTCAGA 10440 

AAGGGTCTGT TATnCTTAGT 10470 



(2) INFORMATION FOR SEQ ID NO: 21: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3647 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 
ATCAGATCTT GAGAATCGAG TTATTAAGTC TATCGAAGAC TTAACTAAAA TCCAACCATT 60 

CATGCCTACA CAAGATTTTG ATTTTAAAAC TAAAGAAATT CAATCAAACA TTTCTGAAGA 120 

AAGATTTATC GAAATGATTC AGTATTTCAA AGAGAAAATA ACAGAAGGGG ATATGTTCCA 180 

AGTTGTGCCA TCAAGAATTT ACAAATATGC GcATCATGCT AGTCAGCATT TAAATCAACT 240 

TTCGTTTCAA CTGTATCAAA ATTTAAAACG ACAAAACCCA AGTCCATATA TGTATTATCT 300 

TCAAATTGTA ACAACTAATC CTATTGCAGG TACGATTCAA CGTGGTGAGA CGACACAAAT 420 

AGATAATGAG AATATGAAAC AACTACTTAA TGATCCAAAA GAATGCAGCG AACATCGTAT 480 

GCTAGTTGAT TTAGGACGTA ATGATATTCA TAGAGTAAGT AAAATCGGTA CCTCAAAAAT 540 

TACTAAATTA ATGGTTATTG AAAAATATGA ACATGTTATG CATATCGTAA GTGAAGTCAC 600 

AGGTAAAATA AATCAAAATT TATCGCCAAT GACAGTTATT GCGAATTTAT TACCAACAGG 660 

TACCGTTTCA GGTGCACCAA AATTACGTGC AATTGAAAGA ATATATGAAC AATATCCACA 720 

TAAACGGGGC GTTTATAGTG GTGGTGTTGG ATACATAAAT TGTAATCATA ACTTAGATTT 780 

TGCATTAGCA ATTCGAACGA TGATGATAGA TGAGCAGTAT ATCAACGTAG AAGCTGGTTG 840 

TGGCGTTGTA TATGATTCTA TTCCTGAAAA AGAACTGAAT GAAACGAAAT TGAAAGCTAA 900 

AAGCTTATTG GAGGTGAGCC CATGATCTTA GTTGTAGATA ATTATGATTC CTTTACATAT 960 

AACCTAGTGG ATATTGTTGC TCAACATACT GACGTCATTG TTCAATACCC TGATGATGAT 1020 

AATGTGCTGA ATCAATCGGT GGACGCTGTT ATTATATCTC CTGGTCCAGG GCATCCATTA 1080 

GACGATCAAC AGTTAATGAA AATCATATCA ACCTATCAAC ACAAACCCAT TTTAGGTATT 1140 

TGTTTAGGGG CTCAGGCACT GACTTGTTAC TACGGTGGAG AAGTCATTAA AGGCGACAAG 1200 

GTTATGCACG GCAAAGTTGA TACACTAAAG GTTATATCGC ATCATCAACA TCTGTTATAT 1260 

CAAGATATAC CAGAACAGTT TTCAATTATG AGATATCATT CATTAATAAG TAACCCTGAC 1320 

AATTTTCCAG AAGAATTGAA AATTACTGGA CGTACCAAAG ATTGTATACA GTCATTCGAG 1380 

CATAAAGAAA GACCGCATTA TGGTATTCAG TACCATCCTG AATCATTTGC TACAGACTAT 1440 

GGTGTCAAAA TAATTACAAA TTTCATTAAT CTAGTGAAGG AAGGATGAAA ACCATGACAT 1500 

TACTAACAAG AATAAAAACT GAAACTATAT TACTTGAAAG CGACATTAAA GAGCTAATCG 1560 

ATATACTTAT TTCTCCTAGT ATTGGAACTG ATATTAAATA TGAATTACTT AGTTCCTATT 1620 

CGGAGCGAGA AATCCAACAA CAAGAATTAA CATATATTGT ACGTAGCTTA ATTAATACAA 1680 

TGTATCCACA TCAACCATGT TATGAAGGGG CTATGTGTGT GTGCGGCACA GGTGGTGACA 1740 

AGTCAAATAG TTTCAACATT TCAACGACTG TTGCTTTTGT TGTAGCAAGT GCTGGcGTAA 1800 

AAGTTATAAA ACATGGtAAT AAAAGTATTA CCTCaAATTC aGGTAGTACG GATTTGtTAA 1860 

ATCAAATGAA CATACAAaCA ACAACTGTTG ATGATACACC TAACCAATTA AATGAnAAAG 1920 

ACCTTGTATT CAT7GGTGCA aCTGAATCAT ATCCAATCAT GAAGTATATG CAACCAGTTA 1980 

GAAAAATGAT TGGAAAGCCT ACAATATTAA ACCTTGTGGG TCCATTAATT AATCCATATC 2040 

ACTTAACGTA TCAAATGGTA GGCGTCTTTG ATCCTACAAA GTTAAAGTTA GTTGCTAAAA 2100 

AAGCAACACT ATCTGGTGAT AATTTGATAT ATGAATTGAC TGAAGATGGA GAAATCAAAA 2220 

ATTACACATT AAATGCGACT GATTATGGTT TGAAACATGC GCCGAATAGT GATTTTAAAG 2280 

GCGGTTCACC TGAAGAAAAT TTAGCAATCT CCCTTAATAT CTTGAATGGT AAAGATCAGT 2340 

CAAGTCGACG TGATGTTGTC TTACTAAATG CGGGTTTAAG CCTTTATGTT GCAGAGAAAr 2400 

TGGATACCAT CGCAGAAGGC ATAGAACTTG CAACTACATT GATTGATAAT GGTGAAGCAT 2460 

TGGAAAAATA CCATCAAATG AGAGGTGAAT AATATGACGA TTTTATCAGA AATTGTTAAA 2520 

TATAAACAGT CACTTTTACA AAATGGCTAT TATCAAGACA AACTTAATAC CTTGAAAAGT 2580 

GTGAAGATTC AGAATAAAAA ATCTTTTATA AACGCAATTG AGAAAGAACC AAAGCTAGCA 2640 

ATTATTGCAG AAATTAAATC GAAGAGTCCT ACAGTTAATG ACTTACCTGA ACGAGATTTA 2700 

TCGCAACAAA TCTCAGATTA TGACCAATAT GGTGCAAATG CCGTGTCCAT TTTAACTGAT 2760 

GAAAAGTACT TTGGTGGTAG TTTTGAAAGA TTACAAGCAT TGACGACAAA AACAACATTA 2820

CCCGTATTAT GCAAAGACTT TATTATAGAC CCGCTTCAAA TTGATGTTGC TAAACAAGCT 2880

GGTGCATCTA TGATTTTATT GATCGTTAAC ATCTTATCTG ATAAACAATT GAAAGATTTA 2940 

TATAACTACG CTATATCGCA AAATCTAGAA GTGTTAGTTG AAGTACATGA TCGCCATGAA 3000

TTAGAACGTG CCTATAAGGT 7AATGCTAAA TTGATTGGTG TAAATAACAG GGACTTAAAA 3060 

CGATTTGTTA CAAATGTGGA ACATACAAAT ACTATTTTAG AAAATAAAAA AACAAATCAT 3120 

TATTATATTT CTGAAAGTGG TATTGACGAT GCATCTGATG TAAGAAAAAT CTTGCATAGT 3180

GGTATCGATG GCTTACTAAT AGGTGAGGCG CTTATGCGTT GTGACAATCT ATCTGAATTT 3240 

TTACCACAAC TGAAAATGCA AAAGGTGAAG TCATGATGAA ATTGAAATTT TGTGGCTTTA 3300

CATCAATAAA GGATGTTACA GCGGCCAGTC AATTACCTAT TGATGCGATA GGTTTCATCC 3360

ATTATGAAAA AAGTAAAAGG CATCAAACAA TTACCCAAAT AAAAAAGTTA GCGTCTGCTG 3420 

TTCCAAATCA TATCGATAAA GTATGTGTCA TGGTAAATCC TGATTTAACA ACAATTGAAC 3480

ACGTATTAAG CAATACGTCA ATTAACACAA TACAGTTACA CgGCACAGAA TCTATTGATT 3540 

TTATACAGGA AATTAAAAAG AAATATTCAA GCATTAAAAT CACTAAAGCT TTAGCTGCaG 3600 

ATGgAAAACm TwATCCCAAA caTtAAtnAA tnTTAgGGGG TCCGTGG 3647 
(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5966 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 
GACGCACCAT GCGTTTTAAA TTTAATGCAC GATTGATACC ATTTTCATAA GCAGTTTTAG 1800

ACACGAATGT CATTGACGTA CTTGTAAGGT TTCCGCCGTA TTGACCATAC ATTTTACGGT 1860

ACTTCATCGG TTCAGATGTA GGTATAGAAC CATTTGCATC GCCATTTACG GCAGAGTTAA 1920 

TTAATCCGCC CTTTACAACT AATTCAGGTT TAACCCCAAA GAAAATTGGG TCCCATAAGA 1980

CAATGTCAGC TAGTTTGCCC GGCTCGATAG ATCCTACATA TTCAGAAATA CCATGTGTAA 2040

TTGCTGGGTT AATTGTATAT TTAGCGATAT AACGTTTGAT GCGATTATTA TCATTATGTT 2100 

CAAAATCACC ATCTAAAGGA CCACGTTGTT CTTTCATGCG ATGTGCTACT TGCCATGTTC 2160 

GTGTAATTAC TTCACCTACA CGGCCCATTG CTTGTGAATC GGAACTAATC ATACTGAATA 2220

CACCCATATC TTGCAGAACA TCTTCTGCTG CAATCGTTTC TTTACGAATA CGTGAATCTG 2280 

CGAATGCGAT ATCTTCAGGA ATAGCCGCAT TTAAATGGTG AGTAATCATT ACCATATCTA 2340

AATGTTCATC TACAGTATTA TGTGTATAAG GCAAAGTTGG ATTTGTAGAT GAAGGTAAAA 2400 

TATTTGAAAA TGCAGCGGAT TTAATTAAAT CAGGCGCATG ACCGCCACCA GCACCTTCAG 2460 

TATGGTACAT ATGAAGTACA CGGTCTTTAA CAGCAGCCAT TGTGTCTTCC ATAAATCCTG 2520 

CTTCATTTAA AGTATCTGCA TGTAATGCAA TTTGAACATC AAATTCATCA GCAACATCTA 2580 

ATGCATGACT CAAAGCAGAT GGTGTTGCAC CCCAGTCTTC ATGTACTTTT AATCCAATTG 2640 

CTCCGGCATT GATTTGTTCA ATGAGTGCAG TTGGATTTGT TGCTTGTCCT TTACCTGTAA 2700 

AACCGACATT AATCGGTAAA CcTTCGGCAG CTTCTAACAT TCTATGAATA TGCCATGGAC 2760 

CTGGAGTTAC AGTTGTTGCT TTAGAACCTT CTGAAGCACC AGTACCACCA CCAATATGAG 2820

TCGTAATACC ACTT7CTAAT GCGACCTCTG CTTGTTCAGG ATTAATAAAA TGAACATGAG 2880

TATCAATACC ACCAGCAGTG ACGATTTTAC CTTCAGCGGC AATGATATCT GTTGTTGAAC 2940 

CTATAATAAT GTCGACATTA TCCATTATAT CTGGGTTGCC GGCATTACCT ATGGCGAAAA 3000 

TATAACCATT TTTAATGCCT ATATCAGCTT TAACCACTTT ATCGTAATCG ATAATAACGG 3060 

CATTAGAAAT GACAAGGTCT GCAACGTTCA CGTCATCACG TGTTACACGA GGATTTTGCG 3120

CCATACCGTC TCTAATAGAT TTACCACCAC CAAAAGTAGC TTCTTCACCA TAAACCGCAT 3180 

AGTCTTTTTC TATTTGAGCA AATAGATTCG TATCACCTAA ACGAATGGAA TCTCCAACAG 3240 

TTGGACCGTA TAAGCTCGTA TATTGATTTT GCGTCATTTT AAAGCTCATG ATCTTTTTCC 3300 

TCCTTTTTTA TTCACGTTTT CAGCACCGTT ATCTCCGAAT ACACCTGCAT ATTCATCATT 3360

TTCATCAGTT GGGCGATAGA CACGTGACTC ATCGATAGGA CCATTGACCA TACCACGAAA 3420 

ACCAAAAATT TTACGTTTGC CAGCATATTC AACTAATTGA ACTTCTTTTT TATCCCCAGG 3480
TTCGAAATCT AATGCTGCAT TTGCTTCATA AAAATGAAAA TGTGAGCCCA CTTGAATTGG 3600

TCGATCTCCT GTATTTTCAA CTTCGATAAC TGTTTCAGGA TGATGGTTAT TAATTTCAAC 3660

CTCTGTACTT TTTGTAATAA TTTCTCCTGG TATCATTTGA CTGCCTCCTT TAAACAATAG 3720

GGTGATGTAC TGTGATTAAC TTAGTACCAT CGGGGAACGT AGCCTCGATT TCGATATCTG 3780

TAATCATGTG TTCGACACCA TCCATGACAT CTTCTTTGTT TAGAATTTGT CTACCATAAC 3840

TCATTAACTC TGCAACGGTC TTACCATCGC GTGCACCTTC TAATAATTCA TCGCTGATTA 3900

AAGCTAATGC CTCAGGATGA TTTAGTTTCA AACCACGTGC TTTACGACGA CGTGCAACTT 3960

CCGCCGCCAC TACAATCATT AATTTGTCTT GCTCTCGTTG TGTAAAATGC AAATTAAAAC 4020

CCCCAATTTC ATATTAGATA CaATTTACAA AATTTATATT AATCCTAATT GTTGTGATAA 4080

ACAAGTAATA TACAAAGTTC AATGTGTAAT TAGAAAATTA TATTTTTAGC ATATCCGATA 4140

TTGAAGCAAA CAATCTAATC GAAAACAAAT AGTGGAATAT ATTTATGTAA AAACCAAAAT 4200

AGTTTTTAAT ATAACTTTTC ATAGAATAGT AGTATATTAA TGAGTAATGA TTCAAAGGAA 4260

AGGTGAAAGA TTTGAAGATA ATAGATGTGC TTTTGAAAAA TATATCTCAG GTTGTGTTAA 4320

TTAGTAATAA ATGGACAGGA TTATTTATCT TAATAGGATT ATTTGTAGCC GATTGGACAA 4380

TTGGATTAGC GGCTATTGTA GGTAGCATCA TCGCCTATAC TTTTGCGCGT TTTATAAATT 4440

ATAGTGAGGC AGAGATTAAT GATGGGTTAG CTGGATTTAA TCCAGTGCTA ACTGCCATTG 4500

CGTTAACAAT CTTTTTAGAT AAGTCAGGAT TAGATATTGT TATAACAATG ATAGCAACTT 4560

TATTAACGTT ACCAGTTGCT GCTGCAGTGA GAGAAGTTTT AAGACCATAT AAAGTTCCGA 4620

TGCTGACGAT GCCTTTTGTC ATTGTGACTT GGTTTACAAT TTTACTTTCA GGACAGGTTA 4680

AATTTGTAGA TACATCGTTA AAGTTAATGC CTCAAAACAT TGAAACGGTT AATTTTAGCA 4740

ACAATGATAG AATaCATTTC ATTCAGTCAT TATTTGAAGG ATTCAGTCAA GTATTTATCG 4800

AAGCGAGTGT AATTGGTGGC GTATGTATTT TAATCGGCAT ATTGATAGCA TCAAGAAAAG 4860

CAACACTCTT AGCTGTTATA GCTAGTTTGT TAAGCTTl'AT CATTGTAGCT CTATTAGGTG 4920

GTAATTATGA TGATATTAAT CAGGGATTAT TCGGTTATAA CTTTGTATTA ATGGCAATCG 4980

CACTAGGATA TACATTTAAA ACAGCGATTA ACCCTTATAT TTCGACTTTT TTAGGTGTGT 5040

TATTAACAGT AGTGGTGCAA CTAGGTACAA CAACATTGCT TGAACCGTTT GGCTTACCTG 5100

CATTAACATT GCCATTTATT ATCGTGACAT GGA'ITTTATT ATTTGCTGGT ATTAAACATG 5160

ACAAAGTAGA TGCTTGATAG TTAAATCAAA CCTAATATTG TTTGAATATC ACCTTAAACT 5220

ATACAGCGAA TTGTATAGTT TAAGGTGTAT TTTTATGGAT AAAATTAAGT GCATACTTAA 5280
GTGTTAAACT AGGAATAAAT AATTTATATT GTGTGTTGTG TGGGGTGACT AATATGAATG
ATATGGATAA TTCCTTTTTA ATAACAACGG AAATTCAAAG AAAATGGATT GAAAAATTCA 
AAGTAATTAG AGATACATTT AAGGCTAAAG CTGAATATAA TGATCAACAT AGCCAATTTC 
CATATAAAAA TATTGAATGG TTAATTAAAG AAGGTTATGG AAAATTAACG TTACCAAAAG 
CATATGGTGG TGAAGGTGCG ACCATAGAAG ACATGGTTAT TTTGCAATCA TTTTTAGGCG 
AACTTGATGG TGCCACAGCA TTATCTATTG GTTGGCATGT GAGTGTCGTA GGACAAATTT 
ATGAACAGAA ATTATGGTCT CAAGATATGT TGGAGCAATT TGCTGTTGAA ATTAATAATG 
GTGCATTAGT TAATAGAGCA GTTAGTGAAG CTGAAATGGG TAGTCCAACA AGAGGGGGAA 
GACCAAGTAC ACATGCTGTT AAAGCTGATG ATGGGTATAT TTTAAATGGT GTGAAGACAT 
ATACATCAAT GAGTAAAGCA CTAACACATA TTATTGTTGC TGCTTATATA GAAGAATTAG 
AAAGTGTTGG TTTTTTCTTA GTAGAC 
(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17310 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:
CTGTGTCATC GCGAAATAGT TAGGGTCATT CATTAATCCT TTTGAACGTA TTTCATCAAA 
ATATAACAAT TTCATTAGTA AAGGGGACTT GTTCAAACCA GCTATAATAC AAAATAGACC 
TATAGTCACA CTGCTTATAA TATAAGAGGT AACGATCACT TTTTTGCTAT TACCTAACTT 
AAAG5TGATC ATCCCTAAAT AGAAATAAAT GACTACAAAT GCATATTTAA CTGTAGATGC 
AAGAACTTCC TTAACCGTAA TAAATATCAA ATCATCAAAA AATaGCaAAC AArGCGTAAT 
AATCATACGA TATGTATACA AAATAATGA.T, AAACTGTmAA AAATGATTTG CCTTTAATAA 
ATGGTTAGCG AAAAACAGTA AATAAACTAA TATTAGTAAT GTGATAAAGT CAGCTATAGA 
AACATTCACA CCGGCAATAA CCGAAGATTG CTGAATAAAA ACCGCTAAAC CGATAAGTAA 
CAATGTTAGT AATTTACTAT TGTGTTGATT TTCCATTATA AACGTCTTCC ACTTCTTTAA 
TCATTTTCTC CTCAGTAAAA CATTCTAAAT AACGTTTTCT AGATTGATTA CTCATTTTGA 
TGTAATCACT GTCTATTAAA TATTTTTCCA GGACTTTAGC AATAGTTTCG GGTTGGTTGT 
TCATCATACA TATACCATTA TCAGCTACTA ATTCTGAAAT ACCGCCAACA TGACTGGCTA 
TTATTAAAAT AAACGTATCG TATTGTGATA ATAAATGACT CGCATTAATG ACATTGCCCA 840

AAAATGTGAC ATCATTTTCT AACCCAGCTT GTACAACTTG TTGCTGACAA TCATTTAATG 900 

TAGGTCCATC GCCTATAAAT GTAAAATGCG CATGATTACT GTTATGTAAT TTCAATATCT 960 

CTATTGCCGC GATTAGATTT TGTGGCAATT TTGGATAAGC AAATCTTGCA ATCATAACAA 1020 

ATTGATGCTT TGTCGGGGCA TTAATCTGTA AATCTTGTTT ATTAGGCAAC ATTCCAACTA 1080 

CTTCGCCAAT ATTGTTATGT GATTGGCTTT TTAGCGTTTG CTTAACAGCG GGAACATCTG 1140 

CAATACCATT ATGTATTGTG GTTAATTTCA ATCGATTAAA TCGATATTTT AACGCTAACT 1200 

GTTTATCGAA ATCTGAAACA CAAATAATGC TATCTGTAAT AAGTGACATT AATTTTTCGA 1260

TAACTAAATA TAGAAATTTT TTAGCTGGTT TAACACCCTC TGTAAAAGCC CATCCATGTG 1320 

CAGTAAAAAC TATACGTGTG TCTTTCGATT TCGAAATGAa CTtCGCAATT CGTCcGACCG 1380 

TtCCAGCTTT GGAAGAATGT AAATGGATAA CATCAGGTTT AATTTTCGAG AATAACTGTG 1440 

CTAACACTTT GACAGCTAAA ATATCTTGTT TAAAGTCAAT TGGACCTACT AAATGTTCGA 1500 

TAATAATTAC ATTAACTCTT GCATCTAGTT GTTCAATCAT TGGTCCATGA TTGCCTACAA 1560 

TGACATAAAC ATCATTGTGT ACGCAAAAAT GGTTGGCGAG TTGAATGAGA TGTGTTTGTG 1620 

CACCACCATT GTCTGCTTTA GTAATACAAT ATATAATTTT CAACTGTTAC AAACCCCTTT 1680 

AATGCTATAC TTTCAATTTC TTAACATGGC TATCTCATCA GATGAATAGT ATTTATAGCC 1740 

ATGCAAATCA ATGATGGCAC ATATTTCTTA ATGCCATTTG ATACTGTCTC AAGGGATTCC 1800 

TCGTTATACT GTAACAATTG GTCACAATCT TTAAAATATA ACTTTTATTT GAACTTATTA 1860 

AGTAAATTAA GACTACCTTG AGCCTTCCCC TGTAATAACA ACCATCAATG TTCTAATTGA 1920

TATATATAGT TCCATCATTA AACTACCTTT ATGTATATAT TTCATGTCAT ATTTCAGTTT 1980 

TTGTTGCGGT GTTAAGTCAT ATCCACCTTG AATTTGCGCA AGTCCTGTTA ACCCTGGTGT 2040 

AACAAGACAT CTTTGCTCGA AACCTATCAC TTCTGAACTA AATAATTCTA CAAATTCCGG 2100 

ACGTTCCGGG CGTGGTCCAA TAAAACTCAT TTCCCCTTTA ACAACATTAA TTAGTTGTGG 2160 

TAATTCATCA ATGCGTGTTT TACGAATAAA CTTCCCGACA TTTGTTATAC GATCATCATC 2220 

TTTATCAGCC CATTGCGCAC CGTTTTTCTC TGCGTTTTTG CACATCGAAC GTAATTTGTA 2280

TATTTTAATT AATTTACCCA TCTTCCCAAC TCTAACCTGA CTATAAATAG GGTTTCCTGG 2340 

CGAATCTATG ACGATAGCAA TGGCGAATAT AACCATAATC GGTAAAGTTA AAAATAATAA 2400

AACAATGCTT AAAATTAAGT CAATCGCACG TTTAATTGGG TAATAGCTTT TTCTCACTTC 2460 

TTCTAGTTTG TCTAATTTTC TTTGATAGGC ATAACCCTTA TTATTATGGA CAGCTTCAAT 2520 
AATTAAAGTA ATCCTTTAAA CCTGTTTCTA CTGTATATTT AGGAACAAAT CCTAATGCCT 2640

TTAAGTTAGA AATATCTGCA TAAGAATGCT TAATATCTCC TTTTCGTGCT TCTTTAAATT 2700 

CATGCTCGAC TGATTTTCCA TATAATTCAC CAATAATACG ATAAACCTCT AATAAATTAG 2760 

TAAAAGTGCC TGTACCAATG TTATAACCGT GTCCAATTGC ATCTTTGTGT TCCATAATTA 2820 

AGCGTACAGA TTGAACAACA TCATATACAT ATACAAAATC TCTAGTTTGC AGTCCGTCAC 2880 

CAAAAAATGT AAATGGCTTG TTATGCTCAA ATGAATCGAA CATCTTTGAA ATCACACCTG 2940 

AATATTGTGA CTTAGGATCC TGTCTTGGCC CAAATACATT AAAAAATTTA ACAACCGCTG 3000 

TTGGTATGTT ATATAACGAA CAATAATTTA ATGTCGTCCG TTCGCCGTAA TATTTATCTA 3060

TTGCATATGG TGATAATGGT AAGATTAATG ATTGATCACT TTTAGGCAAA TCAGGAAGAT 3120 

CACCATAAAC AGCTGCTGAC GAAGCAAAGA TAAAACGTTT TATATGATTA TTATATTTTT 3180 

TAATGATTTC TAACAATCTT AATGTTGCTA CGACGTTTAT TTCTTGAGAT AAGATAGGTT 3240 

TCTCAACCGA CTCAGCAACA CTAACTAATG CTGCTAAATG AATAACATAA TCAAATTGAT 3300 

ATGTCTTCAT GATTTGTTCA ACTGCATCAT ATTCACGAAT ATCTAATTCA AACACATGAT 3360 

CGTCAGCCAA ACTTTTAATA TTTTCTCGTT TACCTGTTCT A7AGTTATCT AGAACATAAA 3420 

CATCATAATC TTGTTGTAAA TCATCTACTA AATGCGACCC AATAAAACCA GCCCCACCAG 3480 

TTATCAAAAC TCTTTCCAAA TCTTCCACCT CATTTATACA TTAAAAATAT ATCATAAAAA 3540 

CATAAAGTAT TGTAAGCTTT TTATCGATAT TTTTTATTTA TAAAAATAAA ATGAGATAAC 3600 

TTTGTGAATT TTTATTGAGA TAAATTAGAT AGTGGTGTTT TTGTGATGTT TTATAATATC 3660 

TTGGGTGTGT TAATACTAAT AATGCTTTCA ACTGATGCAT TAGACTGTGA CATCATAACT 3720

CACTTAAGAA CTTCGCTTAT TAATTTTCTA CCAATACACT CCCTTCTAAG TGCACTAAAA 3780 

AATCCTTACT GCTAAGTGAT TAAACTTAAC AATAAGGATT TATTTATCAT TAGTGGATGA 3840

TTATTAACGG AATCTCATAC CACCATCTAC AATAATTGTT TGTCCAGTAA TGTAATCAGA 3900

GTCTTTACCA GCTAAGAAGC TCACTACATT TGAAACATCT TCTGGTTGAG AAACTCTGCC 3960

CAAAGCAATC TGACTTGTAA ATTGTTCCCA ACCCCATGCT TCAGGTTTAC CTGCTTCTTC 4020

GGCTGTTGCC ACTGCGATAC TTTCCATCAT TGGTGTTTGA ACGATACCAG GTGCGAATGC 4080

AT7CACAGTA ATACCTTCAG ACGCTAAATC TTGTGCGGCT ACTTGTGTTA AACCTCGCAC 4140 

TGCGAATTTT GTACTGCAAT ATAAAGACAA GCCTGGGTTA CCCTCAACGC CTGCTTGAGA 4200

TGTTGCATTG ATAATTTTAC CGCCATGATT GAATTTTTTA AATTGTTCAT GTGCGGCTTG 4260 

AATACCCCAT AGCACACCTG CAACGTTCAC GCCATATACT GTTTTAAACT GTTCTTCAGT 4320 
GCCAAATTGC GCGGCAGTTT GTCTTAcTGC GTTAAATACA TCATCACGGT TTGATACATC 4440

TGCTTTGATA GCAATAGCTT TTGTACCATC ACTTGATAAT TTAAGTGCAG CTGCTTTTGC 4500

CCCTTCTTCA TTGAAATCAA CAACTGCTAC TTTGAAACCA TCTTCCACTA AACGTTCTGC 4560

AATTTTAAAA CCAATCCCTT GTGcTCCGCC AGTTACTAAT GCTACTTTGT TGTTTGTCAT 4620

AAAGATCACT CCTCAAATTT CTTTCCTTTA ATTACATTTT ACTCCTCTTC ATTTGAATAG 4680

TACAACAAAG GTAGCTCCAT TTAACAAAAT ATTCAGATAT TTAAGGTATA GTTAAACGCA 4740

CTACCATTAG TGATTGGCAA TGCGTTTAAA TGTCGTTTTA AAAGTTCTTA TGTTGAATAT 4800

TATTTTTTTA AGTCTCTCGA TTAGTTTGTC ATCAATCTTT TTTCGAGACA TGGTCTTTTG 4860

ATTCAATAGG CGGTTCCGTG TTATCACTGA CAACTTTAGT TGTAGCTTCA TCTTTATGTA 4920

TTTCTTCGTT AAATCCTTCA AGGTTTTTAG TCGTGGGATT TTTAACCTCA GGATGTTCCA 4980

TCATGTCTTG ACTATCAAGT TCCTTTTTAC ACGTGTCTTT ATGTGATGCT TGATTTGCGT 5040

TCCCTTTACT TTTTTGAATA GTGGTAGTAT CTGCTGCAGC TACTAATTTT TTTCTACCTA 5100

AAATAGATAT GGCTGAAACA AACCAGAGTA TTGCAGATAC AAAGTTGCAT AATACTAAAG 5160

CGATAATAGC CAATACAATT AATATGACAC CTTTTGAAAT CCTTTCTTTA AATAAGTCAG 5220

ATGCCAATAC GATGACAGGT ACGATTGAAA GTATAATTAC AAATATAGAA ATTATTGCCG 5280

ATATAACTAT TGTTACTATT AAATAATCAG CTCTGCTACC TGATAATAAA TAGAAAAGGC 5340

CGAAAATTAG TCCATAGCAA ATTACAAACC CACATAAAGT TATAGCCATG AGTACTATAT 5400

AAGCTATTTG AAAATATAAA CCTATCTTTA TGAATGATTT TTCTACATTT TTTTCCATGT 5460

CTATTCCCCA TTTATTTAAA ATTTATACTT TACCTTAAAT ATTCTCTTTA TTCTTTAGTG 5520

ATTTTATCTT TAGATTCAAA TTGATTCTCT GTACTTTCAA TATCAACTTT TTCATTTTCG 5580

TCTGTCGATT CATCTTTTGA GTATTTATTC CAAATCAGCA AAATACCACC AATCAGCCAT 5640

AAAATTGACG AAAGGAAATT ATATAAACAC AGTGCAATAA TAGCATAAAC AATAAAAAGT 5700

GCACCTCCGA TTACAGAGTA ACTTTCCATA TAAATCGCAG TAAAGATGGT TGGTAAAACA 5760

GTGAAAAGAG CCAATATTAA TCCTAATAAA AAAATTGTTT CGTAATCAGA TCCTCCAGCA 5820

ATATTAATAG ATATCATCCT AACAAAAACG ACACTAAAAT ATATTTGAGC TACGATGCCT 5880

ATCCAAATTG CTATTTTTCC TATAATTGAG CTCATACTCA TTCCCCATTT ATTTAAAATT 5940

TATACTTTAC CTTAATATAC CTTATTTTAT TTAATTTTTA TATGCAAAAT ACAAAAATGG 6000

AGAACTTCAA TATTTATAAA ATATCAAAAG TTCTCCACAC TATATTGTTT TATTATATTT 6060

TCGCTATCAA TACGCTAAAT CATCATATTT CCCTCAACAT CACAGTAAAA CTATTGCTCC 6120
TTCCAATTGC GCAGTTGTTC AACATCATCA TCTTGTTTAA GTAATGCCAG TGGTACTTGA 6240

AGATTAAGAC ATCGTCCTGA AATATTAAAG CGTGTCACAC CTGCTGGCAC AGTTTCCCCT 6300

TTATGAACAA CCGCTTCAAT TTCCTTATAA CTCAATGGCT GATACTTCAT GAGTACATCT 6360

TGTTGAGAAA GACAAGGATA TGTACCTTGT GCAATTCTCT CTACAGAACA ACAACCACTA 6420

TAACTTGCGA CAACCTTTTC CCATACTTGA AAATGTGCTT CGCCTAAATC TTTTGTATAC 6480

AAATATTGTT CTGTATCACC ATGACACATT GTAATAAATG GCGCTTCTTG TCTTGTCTCA 6540

GTAGTCCATG GCAAGCGATG TTCTTGTTGT AACGTTTCCC ACCACACACC AAATGGAACT 6600

TTATGTTGCC ATGTACTAAT TGAATATTGT GTTTCATGGA TTTCTTGCAC TGGAACTTTC 6660

TTACATCCTA ACGCTTTCAA ACTTGTATAC CGATGCACAC CATCTATAAC CATATATCTA 6720

CCATGTTGCA TCGCTGTCAC TAAAATAGGA TGACGTATAA AATCATCTGC TTCAATACTA 6780

CTTTTCGTTT TTTCCAATCT TAAAGGTTCG AATGTTTCGT GAAGATCAAT CTTATCTACT 6840

GGTACCAATT TTAAATGTTC ATGAATATGA TTCAATAGTT ATTCATCCTC CTTTGTTTGT 6900

GTTAAATAAA TAAATTCAGG ATGTGGATGG CTTAAGAAAT CGTGATGTGA AATAGACCAT 6960

CCGTATGCAC CTGCATATTT GAAAACAATA ACGTCGCCTG TACTGATTGC GTCTATCTGT 7020

ACTTCTCTAG CAAAGACATC TTTCGGTGTA CATAATTGAC CGACTAACGT TGTGTCCTGT 7080

CTCGAAATTG AAACTTTTTC AAATGAATAT GGATTGTCCT TATAGCGATA AATGTCAAAA 7140

GGATGGTTAT GTTGCCAAGA TACCGGCAGT CTAAATTGTT GCGTACCTCC TCTTAATATG 7200

GCATACCAAG CACCATGTAC TTTCTTAATG TCTAGCACTT CTGTCACATA GTAACCAATA 7260

TGTGCCACAA TAAAGCGCCC ACATTCAAAG TTCAATGTCA CATCTTCCAT TTCTTGCTCA 7320

ACGATAAGTG TTTTAAAACG TTCTACAAAA TTATCCCATT CAAATTGGTT AGTTAAATCT 7380

GCATAGTTAA CGCCTATGCC ACCACCAAGA TTGATATGTT TGAGTGGAAA TCGATGTTTT 7440

TCAGACCATG CCTTTGCTTT TTTAAAATAA AGTTTCACTA CATCGACATG TAAATTCGAG 7500

TCTAAATTGT TAGAAATAGA ATGAAAATGA AATCCATCTA GATGAATCTT TGGCATTGCG 7560

AGCGCAgcTT CAATGACATC ATCAACTTCG TCTTCAGAAA TACCAAATTG TGTTGGGCGT 7620

CCTGCCATAT GCAACGTTGC ATTGGGAAAT GGTCCTGCTA AATTAACACG CAATAAAATG 7680

TGTTGTGTCT TATCTTCATC TTCTAAGATG GCATTTAGCC GTTGTAATTC ATGCATACTT 7740

TCAACATGAA TACGCTGAAC ACCTTCACTT ACTGCATATC TTAGTTCCTC GTCTGTCTTA 7800

CCAGGGCCAC CAAAAATAAT ATGATTTGCT GG7TTAAAAG CAAGACCTTT TGCTATTTCA 7860

CCTTGAGATG CAACTTCGAA TCCTTCAACA TACTGACTAA TTGTATCTAG GATTTTTCGT 7920
TGTTGCAAAT GATGTTCCAG TCCGACTAAA TCATAGATAT AATGACAAAC TGGATGAGAT 8040

TGTGCTTTTA ATTGTTCAAT AACAGGTTGA ACTATACGCA TTAGCCTTCA TCCCCTTTCT 8100 

GTTTAGACGT CGCTAGAGAT GCACTTAAAT GGCGATATAT TTTTCCGCGA TCATCACCTA 8160 

AAATAAATGT TTGTACACCT TGTGCCTGCC ATTTTGCAAT ATCTTCATCT TCACGTGGTA 8220

ATGCACAAAA ATGTTTACCA TGTGCATTCA CAACTTCAAA AATATGTTGA ACATGTGATG 8280 

TTACTTGATC ATCACGCGTT TGCCATGGTA TGCCAAGTGA CTGCGATAAA TCTGCGGCAC 8340 

CTTCGACTAT CATGTCTAAA CCTTCGACTT GTGCTATATC GTCAATGGCC ATAACCCCTT 8400 

CAACATCTTC TATCATGGCA ATCACCATAA TATGCTCATT AGCCATCTCC ATTGCATCAA 8460

GTAATGGTGT ACGTCCAAAT CTTGCCATGC GACCACCATT CAAACTTCTT AATCCTTGCG 8520 

GGTAATAACG ACTTAATTTC ACAATATGCT CAACTGTCTC ACGATCTTTA ACGTGTGGCA 8580

CAATAATACC TCTCGCACCC ATATCCAACA CTTTAATGAT ATCTCTATCT ATCACTGCAG 8640

TGACACGTAC AATTGGTATA ATATGCGCTG CTTCAGCTGC ACGAATTAAA TGCGCTAGTG 8700 

TCTCATCATT AATCGCCACG TGTTCTGTAT CAATCACAAC AAAGTCATAC CCGCTTGCTG 8760 

CGATAACCTC GATCATCAAT GGGTCCGGTA TAGAATTAAA AATGCCATAA ACTGAATCAC 8820 

CATTGTTTAA TCTATGTTTC AGAGATAGTT GTTGCATCAT TGATACCTCC TACACCTAAT 8880

GGATTTGTAA CATGATGAAT TCTTAACTCG GAGTCACTTA ATAATCGACG TGTCGTTAAC 8940 

TTTTCAACTT GAATCGTAGG TTCAAACAAA TCGAAATGTT GATAGTTATT CAACTCTGGA 9000 

AATGCTTCTT GATACGCCTC GATGATGCCT TTAACCCATT GCCATTGCAG CTCCTCATCG 9060 

ATACCATATT GCTTTTCAAT AAATAAGATG ATTTCGGCGA TATTAATAAA GAAAAATGCA 9120

TCATGTAAAA AGTCGCGTAC TAAACGTTCG TCATCTGTTT CAATAAATGA ATTACTATTC 9180 

ACTTTTTTAT GTGCTTCTGG CATTGGCTTT AATGTCAGGT GTGAAGCAGC TTCACTTAAA 9240 

TGCtCACGCT TAAAACGAAC ACCATCATGG AAATCTTTTA AGGCAATACG TGTAGGCCAA 9300 

CCATTTTCAT GAATGAGCA7 CATATTTTGT GCATGCGATT CAAAGGCAAT ACCGTGATAA 9360 

TAAAGCATAT GAATCATTGG ACGAATCGCT ACAGCTAAAA ATTGCTTTGT CCAAGCTTCA 9420 

GAACCATATT GTTTAATCCA ATTTTCAATG AATGGTACAC CATCCTTATC ACTTGCATAA 9480 

AGTGCATTAA ATGGTATCGC ATCCTCTTCA TCGATTAACA TATGATATAT ATTTTCACGC 9540

CATATAACAC CTAACGCACC ATAAACTTGA GTTTGTTTAT AAGGCGAAAG TTGTGTATTT 9600

AAATAAGACT GTCCTAAGAC TTCCCCTAGA AAAACTGTCT TTAATTCATC TTTTAAATAC 9660 

ATATCTTGTT GCTGTATCTG CTTTAACCAA TCCGTAATTT GCGCTGCATT TTCAATTGTA 9720 
TATTTTGTCG TGTCTATTGG CGACATCGTA CGAATCGATT GTTGAGGGTG ATATAGCTCA 9840

TCACTTTCCC CTAACCATAG TACTGTGCCA TTAAGCCTTT CTTCAGCCAA ATCAACTTGG 9900 

ATGACATGTT CAAACTGCCA TGGGTGTACA GGTATCATCT CAACATCATT TACATGTTTG 9960 

CCAGATGCTT CAATTTGCTG TACAAAATGT TCATAAGTCT TATCGCCAAC TTGTTGACGT 10020 

AACATTTCGT TAACTACAAC ATTTCTTGAT ACCGTCGTTT CTACTTTATC TTTGTCGATA 10080 

GCTAACCACT GCAGTTTAAC GTTTGGTACA AAATCAGGAC CAAATTTCAA ATTATCACTC 10140 

AACGTAAATC CTAAACGTGA TTTGTAACTT GGATGATACT GATGCCCTTC CATCGCATAA 10200 

AATTCATAGT CGTTAAATGT CTCAGGTGTT GCTGGTGGGT TTGATTCTCG ATACTGCATA 10260

CTTTGCGTAT CTTTTAATTC TGTCTGTAAT AACTCGACAA TAAATTGTTC TAGCTTTTCA 10320 

TCATTTTTAG GAAATGTAAA TACAACCTCT CTCAATAATT GTGTATAGTC TGTTGTTGTA 10380 

TCTGCCTCAT CTCCTACGAC ACGCTCAATT GGTGATGTGA TACGTATACG ATCAAAGCTA 10440 

TGTGTCTTTT CAGCAGTAAA ACGATACTCT GAATCATGTC CTTCTATTGT AAAATGACCG 10500

ACACCGTCTT GATATGACGC TTTATACACA ACAATATTCT CATAAATAAG TGATGATACC 10560 

AGTTGGTGCA TCACTCTAGT CTTTACACGA TTAAGAATTG TTTGATTCAC AATACGATAC 10620 

CTCCTTGTTA TGACAAATTG GATTTGGTAT ATGTGTATAA ATAGGGTTTG CACCACAATC 10680

ATTCAATTTA CTCATCAAAT TCGCTTTAGC CGcAATGGTC GGCGTTTGAT ATAAATCTTC 10740 

TACACAGTCA ACAAATACTG CGTTATTCGC GTATTCTTTT TTCCAAGTCA TAAGACGATG 10800 

CGCTACAAGT TGCCATAACA CAACTTCATT TCTAGTCGCT TTACCAATAG TTGATACTAA 10860

ATGTCCTAAG TGATTTACTA CAACGTAATA TTTAAGACGA TGCCATGCTT CATCATGTGC 10920 

ATATACAACA GGGCTTGATG CTGCCACAAC ATTTGGCACA AGCTGTTTTT CAGTAGCAAT 10980 

CGTTCTAGAT AGACAAATGC CTTCAAGATC TCTGACAAAG CATACGTCGG GTATGCCATC 11040 

TTTTAATTCA ATTAATGTAT TTTGTACATG TGCTTCTAGA CTAATGCCTG TGTTACTAAA 11100 

CAGCTTTAAT ATCGGCAATA ATGTACGATT CAAATAACAT TCAAGCCATG CTTCTGGTGC 11160 

TAAACCACTT TGCTCAATCA CTTGTGATAA CTTAGACATC GGTGAATCAG GCATCGTTTC 11220 

AAATAATGAC GCCAATACAT GAATATCTTT ATCAGCATGG TAATTCGGTA TCCCTTCACG 11280

AACAATCATG GCACTATTTG TTAATAAATC CATTTCAGGT TCAACTGTTT GCCCTAATGG 11340 

ATTCGGTAAC AATGCACGAT ATCCTTCTTC AAACATCAAT TTAAAATGGG GTGTTTCAAC 11400 

CTCATCTTTG ACTGATGCGA TAACTTGCGC GGCATCAATT GTCCGTTCAA TCTGTTCAAG 11460 

GTCATTCGTA CGTATAAAAT TAGTGATTTT AACGTGTATC GGTAATTTTA AATAAATGTT 11520 
GCCAAGGTCT TTTATTAAAC CTTGTTCACT ATATTGCATA TACTGTGGAT GCTGTCGCAA 11640

CACATTGATT TGATAAGGAT GTGTTGGTAA TAAAATAAAA TCTTTGGGTA TCTCTGATAT 11700 

ATCTATGTCT GCTAATTGAT ACAACACTTT CTCAACCTGA TCTTCTTTAC CTTCTACATA 11760 

GCGCGTGAGC AGAACATCTT GATGCACAGC TAAATAATGC AATTGGAATG ATGTATGACA 11820 

TTCGGGTGCA TATTTCTCTA AATCTGCTTC TGAAAACCCA CTTGCACTCT TAGGAGTCGG 11880 

ATGAAATGGA TGACCTAAGT ATAAAGATTG TTCTGAAACG ATATAACGAT CCTCTACGTA 11940 

GTCTATTGTG TTACTTTGCA AATAACGTGC CGTGCGATGA ATGCTATTAT CGATGTCAGA 12000 

CATAATTTGC GCCATATGTT GTTGCACTGC CGTTTGATTA TCTGCACTTT GAGCCATATG 12060 

TTGCAAAATA CGCGCAATTG CTTCTTTATA AGTTGTTATT TTTTTACTTT TTCCATCGAT 12120 

AAGCCATACC TCTGGATGAT ACATATGATG CCCCATCGCA GACCAATAGC GAAATTCACC 12180 

CGTTAAAGTT TCGAGCTCTG ATAATTGTAT AGACCATTGA TGATTTTGAG GTGGTACTTG 12240 

ATATAAATTT TCTTCTCTAA AATATTCATT TAAAATGCGT TCGATAGCCG CATACGCTGC 12300 

ATGTTGTATT AATTCTTTAT TTTGCACTTT TTTGTTTCAA CTCCCATAAT TTCATTAATG 12360 

TGTGATCGTT GATTTGATTA GTGATGGTTG AACAAATTAA AAATAAACTA CTTACTGCAA 12420 

ATACTACGCC CATAACGATA AACGTAGTAG CTGGTGTAGT ATAACTTGTA ATGGCAGCGC 12480 

cACTaAGACT GCCAATAATT TGACCAACAA CTAACATACT GTTCGTCGTT CCAACAAATG 12540 

TGCCTTTAAG TTGTTGATGA CACGCATTCA CGACAACAAA CATGACACTT TGAATCAATG 12600 

CACTATATGT TAATCCTTGA AGTATTCTTG CAGCCATTAA AAACTCTATA TTCGTCGCTA 12660

AACCTTGCAG TATCGCACTA CAACCACATG CAATCGTGGC AAATATATAT ACTGATTTAA 12720

CATATGATTT ATCATTAAAG CGTCCCCATA AAGGCGCGCT TAATATCGAA GCCGTCCAAA 12780 

ATGCGGACTG TAAAAATCCA ATCACACTAC GGTCATCTAT CGCTGTATGA TTCACTGATG 12840 

AAGCAAGTGG TGATAATGCA GTTAGCATGC CATACATAGC AAAGTTTGCT AAAACGCCAA 12900 

CGATAATAAA TCGACATGTT TGTTGTGTGC ATAATAGACA TTGAAATGAA CGGCGAATAC 12960

CTTTATTAAT ATTTGGTGTT TGTGATTTTG GCATATGTGT CGTTTCAATC AATTTTAATG 13020 

CACCGAAAAT ACAGACAATA AAAGTAATAA CGGCAATACT CATCAGTAAC GCACTAAAAC 13080 

CTAATATCGA AGCTGTAACA CCGCCAATTA ATGGCCCCAC AAGAGACCCT GCGCTGACTG 13140 

AACTTTGCAG TCTTCCTAAT ACCTTTCCAC GATCTTCAGC TGGCGCCTCT GCACTCGCAA 13200 

ACGCACTTGA TGCATCAACA ACACCACCAA ATAGTCCCTG CAATAACCTC ACAAGTACAA 13260 

ACTGTAATGG TGTCGTACAC AATGCCATTA AAAATAAGCA TACCGCCAAA CCAAGTAACG 13320 
CtATCATCGT CGTTACAGCT GGAGCAGCAA TCGCTATACC ACTCCACAAC TGTATTTCTA 13440

CGACTGATAG ATTTTGTAGT GATGCCATAT AAATTGGCAA TAATGGCACA AGTACTGTCA 13500

GTCCAGCAAT CGCTATAAAC TGACTGAGCC ATAAAATGCG AAAGTTACTG CGCCATATAG 13560 

ACTGATTAAT CATATGTCAC CATTGGATTT GGTACGGTAG TTAAACCTGA AGGCATACTA 13620 

CCTCCACCAC TATCACGTTG ATATAGCAAT GGTAATAAAA TTTGTTTGAA TGGCCACGTC 13680

TGTTTATCAA ATAAAATGTG TCTGACAGCT AGCTGATCAG TTGTAACCCA GGAAATAGTT 13740 

GCCACTTCAT TTTTTAAAAT TTGTTTTAAC AACGACATAA GTTCATGCTC ACTTACACCA 13800 

AATAAATCTT GAATTGCATC AATAATGGCA TATAGATTTA CCGATACAGC TAATGTTTGA 13860 

AAATAAGCAA AGAATGTTTC CAAATCCTCA TTAATTAGCG TATTAGGTGT ATCTTCTCTG 13920 

ACGACATACT TCGGCAATGA AAGCTGATGT GCTGTTAGCC ATGGTTTATA AATTCTGACA 13980 

GTATCATGAT CACGTAACAC GCATTTTTGT ACACGTCCAT CTTCAAATGA CAACAATATA 14040 

TTTTGACCAT GCAACTCTGG TAATGCGCCG TATTGCATAA ATGATAGTGT TACCTTTAAA 14100 

AAGACTTGCG CGATATCTTC AAATAACGTC ATGACATCAT TTTTAGAAAT ATTATCTTTT 14160 

CCACAAATCA TTTGATATAA AGTGCGATCA TTTGCCGCGA GTGCTGCCAT TGACACTAGC 14220 

TGTTGCGTAT CATTTTTGGC TAGCACTTCG GGATACTTTC TTAGCTGAAC AGTTAGATGA 14280 

CCTAATTGAT CTTTGAAAAT ATCATTATCT TGACCCATAT ATGACCACCA AGCTGTTTCA 14340

TCACAAACCA TGACATACTT AGCTAGTGCT TCATCTTTTT CTATAAGCTG ACGTAATAAT 14400 

TGTTCTGCTT GTTCTCCGTT TTTCATGTAA CGCGTAGGCG TTAGCCTTAA TGCGCCTAAT 14460 

GACTGCATTG CAAATGGTAC TTTGACATGG TTATACGGTG CGCCAATATC AATTAATGAA 14520 

CGCATACTTG AAGACGACAG ATAATCTCCA AATTTTAACG GTAATAGTAC AACCAACTTT 14580 

TCACTAATCT CTTTCGCAAA GACGTTCGGC AGAATATGCT GATATTGCCA AGGATGTACC 14640 

GGAAATAGTA CATAGTCATC TATTGATAAC CCTTGATCAT TTAACATGTC TGTCGCTTGT 14700 

TCTTTTATAG GTACTGTCAA ATTTTCTAAT TCATCGATAT TTGCAGTATC GCCATGAATC 14760 

ATATGTGTCT TTTTAACTGC TGCAACCATT AAAGGAAATG ATTGATTTAA TTCAGCTTGA 14820

TACACTTGAT AATCCGCTTC TCTTAATCCT CTTTTTTCTT TAGCTAATGG ATGAAATGGA 14880

CGATCTTTTA AACTTGCAAA CTGCTCTGAC ATCACAAAAG GATGTGACGC TAAATCTAAT 14940 

TCTGATAATT GTTTAGCAAG CTGTGTGGCA GCAGTAGTCA GTCCTTCTTC AACGCGAGCC 15000 

ACTTCCCATT CATGACTTAG ATCACAATTC ATATTAGCAA TTGTTTGCCA AAATTCAGCT 15060 

GCCGTTAAAG GTTGCTTAGA CACCCTTCCC TCTATCGTAA TTGGTTGTGA ACTTTCGTAA 15120 
TATATCAAAA GCGTTTGTCC GTTTTCTTTA GTAATCTCAC TATTCGATAC AATTCCGGCT 15240
ATATCTTCAA ATAATAATGC ATCAACTAAA TCTCTTAATA TTATCGCTTG TGCTGTATTG 15300
ACTGCTGTAT GATTCTGCAA TGTTCAGACA CCTCGCATTC TTAATATAGG TTCAATGTTG 15360
TCCCAATATT TTGTTGTTGT GCCTGTTGAT AAATAAAATA AGCACTTGAA ATATCTTCGA 15420
TAGCCATACC CATCGGATTA AGTAATATGA TCTCATCATC GTCTTCACGT CCTGGTATGT 15480
CACCTGTCAC AAGTTGTCCT AGTTCAGCAT GAAGAGCTTC TTTGCTGAAT TTACCTTCTA 15540
ACACCAATTG GTTAATAGTT TTCTTTTCTC GATTACATTG TGACCAGTCA TCTACTACGA 15600
CTTTGTCAGC TTTAATAAAG ACTTCTTTAT GCACATCCAT GATAGAAATG TTGCTAATAA 15660
ATGCACCCTT TTGTAACCAA TCATATTCAA TGTATGGTTG ATCCGTTACG GTACATGTAA 15720
TGACTACTTC ACCATTTGAT ACTGCTTCTT TAGCATTTTC TGTCGCAATA AAATTAATTT 15780
CCGGACGCTG TTGTTGCCAT CTATCAACAA AGCGTGCACA TGCTTCAGAG AATTGATCGT 15840
AAACAAACAC GCGTTCAATA TGATCGAATT GCTCTAACAT ACTTTGTAAT TGCTTGTCTC 15900
CGATTAGCCC GCATCCAATG ATTGTTAAGT CTTTAAATCC TTTTTTAGCC AAATGCTTTG 15960
CTGCAATCAC TGAAACTGCT GCAGTACGCA TACTACTAAT TAAACTTGCT TCCATAACTG 16020
CAATTGGATA ATTCGTTTCT GGATCATTCA AAATAATGAC GCCACTTGCA CGCTCCATAT 16080
TACGTTTCGA TGGATTGTCG TGCTTACTAC CTATCCACTT AATACCTGAA ATTGCGTGTT 16140
CACCACCGAT ATGACTTGGC ATTGCAATAA TTCGATCTGC GATGTGTCCA TTTTCAGGAT 16200
CCtGTCTTAA ATACGGCTTA AGCGGTTGTA CAAAATCATT GTGCGCATGG GCTGTTAATG 16260
CTTCTGTTAA TGCGTCCACA TAAACTTGTG AATGATTACC TCCCGCTTGT TCAATATCTG 16320
ATCTATTTAA ATACAACATC TCTCTatTCa TTCTGaTTTA ACTCCTTGTC TTGATTTCAT 16380
TTTTTCTAAC CATGTATCTG AATAAACTAA ATCTAAGTAA CGATCGCCTC GATCTGGTAA 16440
AATCGTGACA ATTGTTGCAC CTTCTTCAAT TGACGTTATC AACTGCTCAA TCGCTGCAAT 16500
AATCGAACCT GTTGAAcCTC CGGCAAATAT GCCTTCATAA TCAATCAGTT TTCGACAGCC 16560
CAAAGCAGAT TGATAATCAT CTACATGGAT CACTTGATTA ATTTCTGATC TATTCAATAT 16620
TTCGGGTACA CGACTAGCAC CGATACCAGG TAATTCTCTA TTAATAGGTT TGTCACCAAA 16680
AATGACTGAC CCTTTCGCAT CAACAGCAAC AATTTGTGCG TTTGGATGCA CTTCTTTTAT 16740
TTTTCTACTC ATACCCATAA TGCTACCTGT CGTGCTGACT GGCGCGACAA AATAATCTAT 16800
AGGTTGCTTA ATTGTTTCAA CAATCTCTGT GCCTGCACCA TGATAATGGG ATTGCCAATT 16860
TAACTCATTC GCATATTGA7 TAATCCAATA TGCATCGTCA ATAGTGGCTA ACAGTTCTTG 16920
TACATTGGCA CCATAACTTT TAATAATTTT CAAATTTGTT GGTGATATTT TAGGATCAAC
AACACACGTG AGTTTTAATC CCTTGATTTT AGCTATCATT GCCAACGCAA TGCCTAAATT 
ACCAGAAGTA CTTTCAATTA AATGTGTATT CTCAGTGATT AAACCATGTT TAATACCATG 
TTCAATGATG TACTTGGCAG GTCGATCTTT CATGCTGCCT CCAGGATTCA TATACTCTAA 
CTTTGCAAAC ACTTCATGTT TCGGAAATAG TTGATGAAGT TGAACCATAG GTGTTTGCCC 
TACAGAATCT AACAATGAAT CGTGCACATG 
(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5423 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:
ATACTAGTAA GCGCATCGGT TATTGACATC GAATTCAACT TTAACAGTTT TCATGTTCGG 60
TGATGTTTCa ATAGAATGTG TGTGTTGTAC TTGCGCATTT ATATTTCCAC CTAAATTACT 120
TAAGTTTCCT GTAATACTAG AAATGTCAGG TGCGTTTAAT GTAGGTTGAA ATGCATCAAC 180
TACTTTATCT GCAACATTAG AAACATTACG GATAACTTTA CTTGAATGAT TATCTATACC 240
TTTAACGAAA CCTAACATTG AATACATACC AACATCCATG AATTCACGTG AAGGTGAGTG 300
AATACCTAGC GCTCTTTTGG CTGCATTTAA AGCACCTTTT GCTACACTAG CTGCTTTTTC 360
AGCTAAGTCT CTAGCCATAT TACCAATACC TCTCATCAAA CCACGGATCA TATCAGCACC 420
TGCTGATACA AAGTCATCCA CAAAGCTTTT AACTTTATTT ACTGCATTTG TCATACCTTG 480
ACTAACTTTG TTTACAACAT TAACGAATCC TTGAACAACT CTATTAACAA rGTTAATTAG 540
CGTACtTGTt ATAGTAGATA CCCaTnGCAT ACCTTTAGTG ACmATGAAGT TCCAAGCTTG 600
AGACATTTTG TCTGATATAG TTGAAACAAC TTGTGTGAAT ATGCTTACAA CTTTATTCCA 660
AATTGTCGTT AATATACCAG ATAAGAAACT CCAAATCGTA TTCCATATAT TAGAAATAAA 720
ACTCCATGCC GCTTGTAACG CAGTAGATAT AGCTGTAGTG ATAGCGTTCC AAACCTTAGT 780
TGCCACAGTA ACTATAGTGT TCCACAACGT TTGTAAGAAC GTCCAAATAG CGTTCCAAAT 840
TGTTATTGCG ATAGTCATAA TTGTGGTAAA CACTGTAGTT ATTACAGTGA CTAACAAATT 900
CCAAATCGTA GTAGCGATTG TAATTATCGT ATTCCAGATT GTACTTAAGA ACGTCCAAAT 960
AGCTGTCCAT ATCGTCATAA CTATTGTCAT TATCGTCGTG AAAACAGTTG TAATGATTGT 1020
ATAAGCGACT ATTTGATTCC AAACAATCAT TATAAAATTG TAAACATTCG ATACTGCTGT 1140

AGTGATAGCT GTTAAAATAG CATTCCATAC AACCGAAGCT ACAGCTTTTA ATACATTCCA 1200 

AACATTAACC ATAAACGTTT TTA7CGCATT CCAAGCATTT ATAATAAAGT TTCTGAATCC 1260 

TTCATTTTTA TTCCACAATA AAACGAATAT AGCTATTAAT GCAGCAATTA CACCAATTAC 1320 

TATTGTTATT GGACCGCCTA AAATACCAAA CACAGTTACT AGTCCTGTGA TAGCATTTCT 1380

AATTAATCCA ATCTTACCGA ATAACAATTG GAATATAACT GATATAATTT TTAATGGTCC 1440

TTTTAATAAC ATGAACGCAC CTTTTAAAAT TGTTAATCCC GCTCTTAATA AACCGAACTT 1500 

ACTTACTAAT GCAATGrTTC TACCTATTAA TCCGCCACCC ATAAAGTTAG ATACAGCAAG 1560

AATAATCGGT ATTAAAAATC TAAATGCACC AACTAAAGTT ATAATGACAC CAACTAATTG 1620 

TGCTGTAGCT GGATGCGCCT CAAACAAGTT AGCTATCCAA CCAGTTATTG CAACTGCAAC 1680

GCGTAATACT GCACTAGCTA TAGGAGCCAT TGCTGTTGCG AATGCArmTA ATCCTCTTGC 1740 

GATGTTTCCA ATCAATTGCA TTATTAGTGG TCCATTTGTT TGTATATAAC TGACAAAGTC 1800 

TTTAAACCCT TGAGATTGTC CTACTTGTTC AGACCATTCC CTAAACTTAG CTGTCATTTG 1860 

TTCAAGAGAT TGGAATATGC CAGTTGATGA TCCGCTGAAT GCATTCATCA AATTGTTAAT 1920 

TCCAACGAAA ACATTTTTGA AAATATTACC AATGATAGGT AAGTTTGTTT TTGTGTATTC 1980 

AATAAAACGA GTTATCGAAT TTTCTCCAGC TGCACTATTA GCCCAGTTAG AGAAAGATTG 2040 

ACCTAATCTA TCCAACCAAT CAGCCGACCA TTGAAACAGT GGTGCTAATT GCGTGAATAC 2100 

ATTGACTAAT CCGTCACCAA AACCACCTGC AGCACTTAAT AGCTTGTTAA ATACCGAAAC 2160 

ACCCGTTGTA TTCATCATAT TAAAGAATCT TGAAGCTACA CTGCTATTTT CAGCCCATTT 2220

AAGCACGCTT TGAGACGCTT CTTCCATTCC TCTTGAAATA CCACTAAAAA ACGGTTGTAA 2280 

GCTCTGCATT GCAGTTTTAA CAGTATTTAA ACCATTTGCA AGAGTTGTGA AGATAGCGGA 2340 

TTGATTTTGC TTTATAATAT CAGTCCATGC TGACTTTACG CCATCTAACG CTTTTTTGTA 2400 

TTCGTTTGTT GCTGAGCTAG CTTGTAAAGT GCCATCATTA AGCATCTTTA TAGCGCTGAT 2460 

AGCCATTGCG CCAAACGCTA CAAATCCTGC TCCCGCTATT GCTACGGCAC CACCTAAAGC 2520 

AAGTACACCA CCAGTTAACA CTTTGATAGC GTTTAATAGC GCAAATACTA CAGGTACTAC 2580

GCTCGCTATT ACAGGTATTA AGATACTAAA AGATGATGTA AGTAATCCAC CAACCATATT 2640 

AGAACCTACA GTACCGAACA CACGGAACAT ATTAGCTAAA TTCCCCATCT GTCTTTGAAA 2700

ATTGTCATTT GCTTTTATTA TGTAGGCATA AGCTTTCTTT AAACCATTAG TATCGACATC 2760 

TACCTTTGTT GTTTTTTTGT TCGGCAATGC GTCTAATGAT TTTTTAAACG CATAAATAGT 2820 
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AAGTTCTTCT TTAGTACGTT TGATTTTAGA GTTAGCAACA CCATTGTCCA CGTCTATAAT 2940

AGCTTTGGCT TTAGACCTAT TTAATGCTTC GAGACTAGCT TTAGATACTT TTAACACTCG 3000

ATTGAATTTA CTGTTATCTG CATTGACGTC AATATTGACA CGTTTCTTTT CTAATTCTGA 3060

TAATTTAGCT TCTGTTTCAG CGATATCTTT AATCAACTTT TGTTTTTGCA ACTTAACTTC 3120

TGGTGTAACT TCTTTAGAGT TTAGTTTGTC TAGTTCAAAA TTCGATTCTA GTACCTTTTG 3180

TTGTAAATCT TGTATACTAG CATCTAATTT AGCTTTTACA TTTTTGTTAC TAAAGGCATC 3240

TAAAGACTTT TTAGCAACTT TGATAGTTTT TTGTAAATTT TTATCGTTAG CGTTTAATTC 3300

AACATCTTTA GTTTGATCTG CTACTCGTTT AAATCTTTGC ACAGACTTAA CCGCACTATC 3360

AATTTGCCTT TTGAATTTGG CTACACTAGC TTCAATAGTC GCTTTAATTT TATATTCCGT 3420

CACATTAACA CCTCTCTTTC TATTGCTTAT TAAATTCTGC TATAACTTTA AAGAATTCAT 3480

TATTTTGTGG TTCGTATTCA TCACGTTCGC TACTAAATCT TATATCTTTA CCTTCGTTAA 3540

GCCGTTGGAT ATTTTCTTCA TAAGGCAATA CGTCGTTTGC ATTGTTAAAA ACATATTCCT 3600

CTTTAGGTTT ATTTTCTGTC CCAACATTTT TAGTAGCTGC AGCATCACGA ATAGCAAACG 3660

CAAGTTTGTA ACGTTCGAAT TCTTGGGTTA GCATTTCATA CTCTTTCGCA TACATTCGAT 3720

AGTTATATTC TGTTAATGTC ATTTGCTCAA TAACGTTCAA ATCTGTAATA CCAAGTGTTG 3780

ACATACAAGT TATAACGATT CTGTCGTAAG TTATTAGGcT TCCGCTGGTT TTTCTTCCGT 3840

TTCCACTACT TCGACTAGGT TTCGGGTCAT AGGTCGCTTT CCCAAcTCCG TTAAAATATC 3900

CGAACCGAAT TCTTCTAGTC CGATATTTTC TGCGATTTCA TCTAATGCTT CATCAATGTT 3960

ATTAATAGTA ATTGCTTGTT TTTTTAAGTG AGATGTAGCT GCGATTAAAA cTTCGCCAAT 4020

CACAACCGGA TTTCCACTTT CTAAACCTAC AGGCAACATT GATACACCTT GACCGATAGA 4080

AGCTTGTTCA ACTTTTAAAC CTAATCGGTT ATCGATTTCT CTTAAAAATT TAAAACCAAA 4140

ACTTAATTCT AATGACTT7C CGTTAATTTC TACATTCATA ACTTAAAATC TCCATTCATA 4200

ATTAATTTAA ACAAAATAAA mArGCTTAAC GCCCTATTTT TATACCTCTC TTGGTGCAAC 4260

CGGTGGTGAA TCTACTTTAG GTTGTGGAAT TGCTGTTAAA TCTTCGCCAG TTAATGCATC 4320

TGCTTTTGTA GTGTCGTGGA ATCTGTATcC AGTCGCCTTA AGTTTCTTTG TTACAGCCTC 4380

AGGTAGTGTT GCAAATCCAC GTTGGAAACG ACCATTCACT CCATATTCAT ATTCATATTC 4440

ATCAATACCG TTAGCTTCTG CTTTTAATTC AAATTTATTG TGGAAACCTT GGAAATATTT 4500

CGCTTTAAAT TTAGCGGAAT CCCCATTTTT GCCTGGTATT CTACTTTCAA CTTCCCAAGC 4560

TTCATACAAT ACGCGATCTA CAACTGCATC TTCAATTTCA TCTGCAAAAT CGTCACCATA 4620
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GTCCATTGTA TCCTCTGTAT CTGTATCAGC TTCATGTGAT AAGCCGTATT CAGTTAAAAA 4740 

AAGCATTTTA GTAGCATCTA CTTTTTCGCC AGCTTTTCTA AATAAAATAA TACGATCATT 4800

ACTATTTTTC ATATTTGCCA TTCAATATTC CTCCGTTTTT TAAAATGTTT TGTAAGATAT 4860 

CGTTACTGAT GTGTGTAGCA ATTCTTGATT GGTAGTATCA TCAACTAACT GTGTGATGTT 4920

AGTATCTTCT TCTTCAAAGT CATAATCGTT TGTTTTAACG CTAGGTGTTA AATCATCAAT 4980

ACATCTTTTA ACAAGTCCGT CATGATGTCC TAAATCATCG CTTACACTCC AAATATCAAT 5040

AACTAAATTC GTATCGCCAG AATAACTATC AAACGTGTAC TTACTTCTAT TTGACTCCGG 5100 

CATTTTTATT ACAAAAAAAG GATACGGAAT CTCTTGTTGC ATCTCTTTAC GAGAAATAAC 5160

AGGGAATCCA TATCCTTGTA GCGTTTCATA CGCTTTATTA TAAAGTTGTA AGTTCGGTGT 5220 

CATGCTTTTA TCTCCTATTC AAACAACGCT TTCAATTCTT CTACAGTTGA TTTCCTAATC 5280 

ACTTCGTATA CCGGCCACAT AAAAGGTTCA GCCTCCATGT ATCGAGTACC AAATTCTAAG 5340

AAACCACTAT AAGCTGCGTG CGATGTGATA GTGTATTGCA AATCGCCAGT TTTTTTATAT 5400 

CTGATATTGC GTGATaAATT ACC 5423 

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6251 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

AAACGCAGAT GTTCAATTAG AACCAGTCTA TCGTATTAAG GAAGGTATTA AACAAAAGCA 60 

AATACGAGAC CAAATTAGAC AAGCGTTAAA TGATGTGACA ATTCATGAAT GGTTAACTGA 120 


TGAACTAAGA GAAAAATATA AATTAGAGAC CTTGGACTTT ACTTTGAACA CATTACATCA 180 

TCCTAAAAGT AAAGAGGATT TATTACGTGC TCGTAGAACC TATGCATTTA CTGAACTGTT 240 

TTTATTCGAA TTACGTATGC AATGGCTAAA TAGATTAGAA AAGTCATCTG ACGAAGCAAT 300 


TGAAATTGAT TATGACATAG ACCAAGTTAA ATCATTTATT GATCGTTTAC CTTTTGAACT 360 

AACTGAAGCA CAGAAATCCA GTGTTAATGA AATTTTTAGA GATTTAAAAG CACCAATACG 420

TATGCATCGA TTACTTCAAG GTGATGTAGG TTCAGGAAAA ACAGTAGTTG CTGCAATTTG 480

TATGTATGCG TTAAAAACTG CTGGTTATCA ATCAGCATTG ATGGTACCAA CTGAAATTTT 540 

AGCAGAGCAA CATGCTGAAA GTTTAATGGC TTTATTTGGA GATTCTATGA ACGTTGCATT 600 
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10 






TACGATTGAT TGTTTAATTG GAACCCATGC TTTGATTCAA GATGATGTGA TTTTCCATAA 720

TGTTGGTTTA GTAATTACAG ATGAACAACA TCGATTTGGT GTGAATCAAC GCCAGCTTTT 780 

AAGAGAAAAA GGTGCAATGA CGAATGTGTT ATTTATGACA GCAACGCCGA TACCAAGAAC 840 

ACTAGCAATA TCAGTTTTTG GTGAGATGGA TGTGTCTTCA ATTAAACAAT TACCAAAAGG 900 

TCGTAAACCT ATCATTACTA CTTGGGCAAA GCATGAGCAA TACGATAAAG TTTTGATGCA 960 

AATGACCTCA GAGTTGAAAA AAGGTCGTCA AGCATATGTC ATTTGCCCGC TAATAGAAAG 1020 

TTCTGAGCAT CTCGAAGATG TTCAAAATGT TGTCGCATTG TACGAGTCTT TACAACAGTA 1080 

TTATGGTGTT TCCCGTGTAG GGTTATTGCA TGGTAAGTTA TCTGCCGATG AAAAAGATGA 1140

GGTCATGCAA AAGTTTAGTA ATCATGAGAT AAATGTTTTA GTTTCTACTA CTGTTGTTGA 1200 

AGTAGGTGTT AATGTACCGA ATGCAACTTT TATGATGATT TATGATGCGG ATCGCTTTGG 1260 

ATTATCAACT TTACATCAGT TACGCGGTCG TGTAGGTAGA AGTGACCAGC AAAGTTACTG 1320 

TGTTTTAATT GCATCCCCTA AAACAGAAAC AGGAATTGAA AGAATGACAA TTATGACACA 1380 

AACAACGGAT GGATTTGAAT TGAGTGAACG AGACTTAGAA ATGCGTGGTC CTGGAGATTT 1440 

CTTTGGTGTT AAACAAAGTG GaTTGCCAGA TTTCTTAGTT GCCAATTTAG TTGAAGATTA 1500 

TCGTATGTTA GAAGTTGCTC GTGATGAAGC AGCTGAACTT ATTCAATCTG GCGTATTCTT 1560 

TGAAAATACG TATCAACATT TACGTCATTT TGTTGAAGAA AATTTATTAC ATCGTAGTTT 1620 

TGACTAATTG CCATGCTGAT TTGTCAATTT GAGTGCAACa CTTCGTTAAT TGAGTGATAT 1680 

GACACTTGAA CTATTTAAAT GTAAAGTGGT ATTTTAACAA TTTATAAATT TTCGACTAAA 1740 

TAATAGCTAA ATATTACAGT TATTTGTTGA GTCGGTTAAA TAGAAAGTGT TATGATATGT 1800

GAGGAATGTT TAAGACTAGG TACTAAAAAA TGAGGGGTGA GACGTTGAAA CTAAAGAAAG 1860 

ATAAACGTAG AGAAGCAATC AGACAACAAA TTGATAGCAA TCCCTTCATC ACAGACCATG 1920 

AACTAAGCGA CTTATTTCAA GTGAGTATAC AAACAATTCG TTtAGaTCGC ACTTATTTAA 1980 

ACATACCAGA ATTAAGGAAG CGTATTAAAT TAGTTGCTGA AAAGAATTAT GACCAAATAA 2040 

GTTCTATTGA AGAACAAGAA TTTATTGGTG ATTTGATTCA AGTCAATCCa AATGTTAAAG 2100 

CGCAATCAAT TTTAGATATT ACATCGGATT CTGTTTTTCA TAAAACTGGA ATTGCGCGTG 2160 

GTCATGTGCT GTTTGCTCAG GCAAATTCGT TATGTGTTGC GCTAATTAAG CAACCAACAG 2220 

TTTTAACTCA TGAGAGTAGC ATTCAATTTA TTGAAAAAGT AAAATTAAAT GATACGGTAA 2280

GAGCAGAAGC ACGAGTTGTA AATCAAACTG CAAAACATTA TTACGTCGAA GTAAAGTCAT 2340 

ATGTTAAACA TACATTAGTT TTCAAAGGAA ATTTTAAAAT GTTTTATGAT AAGCGAGGAT 2400 
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TTAGAAGCCG TACAAAAGGC TGTTGAAGAC TTTAAAGATC TAGAAATTAT ACTTTTCGGT 2520 

GACGAAAAAA AGTATAATCT GAACCATGAA CGAATCGAAT TTAGACATTG TTCTGAAAAG 2580 

ATTGAAATGG AAGATGAGCC TGTTAGAGCG ATTAAACGTA AAAAAGATAG CTCAATGGTA 2640 

AAAATGGCTG AAGCTGTGAA ATCTGGTGAA GCAGATGGAT GTGTGTCAGC AGGTAATACT 2700 

GGTGCTTTAA TGTCAGCTGG TTTATTCATT GTTGGACGTA TTAAAGGTGT AGCTAGACCG 2760 

GCTTTAGTAG TAACATTGCC AACGATTGAT GGAAAAGGTT TTGTCTTTTT AGACGTTGGT 2820 

GCAAATGCTG ATGCTAAACC TGAACACTTA TTACAGTATG CGCAACTAGG GGATATTTAT 2880 

GCTCAAAAAA TTAGAGGTAT TGATAATCCG AAAATCTCAT TATTAAATAT AGGAACCGAG 2940

CCAGCTAAAG GTAATAGTTT AACGAAAAAA TCATATGAGT TATTAAATCA TGATCATTCA 3000 

TTGAATTTTG TTGGGAATAT TGAAGCGAAG ACATTAATGG ATGGCGATAC AGATGTTGTA 3060 

GTTACCGATG GCTATACTGG GAACATGGTC CTTAAAAATT TAGAAGGTAC TGCAAAATCA 3120

ATCGGTAAAA TGTTAAAAGA TACGATTATG AGTAGTACTA AAAATAAATT AGCAGGTGCA 3180 

ATATTGAAGA AAGATTTAGC TGAATTCGCT AAAAAGATGG ATTACTCAGA ATACGGTGGT 3240 

TCCGTATTAT TAGGATTGGA AGGTACTGTA GTTAAAGCAC ACGGTAGTTC AAATGCTAAA 3300 

GCTTTTTATT CTGCAATTAG ACAAGCGAAA ATCGCAGGAG AACAAAATAT TGTACAAACA 3360 

ATGAAAGAGA CTGTAGGTGA AtCAAATGaG TaAAACAGCA ATTATTTTTC CGGGACAAGG 3420 

TGCCCAAAAA GTTGGTATGG CGCAAGATTT GTTTAACAAC AATGATCAAG CAACTGAAAT 3480 

TTTAACTTCA GCAGCGAACA CATTAGACTT TGATATTTTA GAGACAATGT TTACTGATGA 3540 

AGAAGGTAAA TTGGGTGAAA CTGAAAACAC ACAACCAGCT TTaTTGaCGC aTAGTTCGGC 3600

ATTATTAGCA GCGCTAAAAA ATTTGAATCC TGATTTTACT ATGGGGCA7A GTTTAGGTGA 3660

ATATrCAAGT TTAGTTGCAG CTGACGTATT ATCATTTGAA GATGCAGTTA AAATTGTTAG 3720 

AAAACGTGGT CAATTAATGG CGCAAGCATT TCCTACTGGT GTAGGAAGCA TGGCTGCAGT 3780 

ATTGGGATTA GATTTTGATA AAGTCGATGA AATTTGTAAG TCATTATCAT CTGATGACAA 3840 

AATAATTGAA CCAGCAAACA TTAATTGCCC AGGTCAAATT GTTGTTTCAG GTCACAAAGC 3900 

TTTAATTGAT GAGCTAGTAG AAAAAGGTAA ATCATTAGGT GCAAAACGTG TCATGCCTTT 3960 

AGCAGTATCT GGACCATTCC ATTCATCGCT AATGAAAGTG ATTGAAGAAG ATTTTTCAAG 4020 

TTACA7TAAT CAATTTGAAT GGCGTGATGC TAAGTTTCCT GTAGTTCAAA ATGTAAATGC 4080

GCAAGGTGAA ACTGACAAAG AAGTAATTAA ATCTAATATG GTCAAGCAAT TATATTCACC 4140 

AG7ACAATTC ATTAACTCAA CAGAATGGCT AATAGACCAA GGTGTTGATC ATTTTATTGA 4200 
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AACATCAATT CAAACTTTAG AAGATGTGAA AGGATGGAAT GAAAATGACT AAGAGTGCTT 4320 

TAGTAACAGG TGCATCAAGA GGAATTGGAC GTAGTATTGC GTTACAATTA GCAGAAGAAG 4380

GATATAATGT AGCAGTAAAC TATGCAGGCA GCAAAGAGAA AGCTGAAGcA GTAGTCGAAG 4440 

AAATCAAAGC TAAAGGTGTT GACAGTTTTG CGATTCAAGC AAATGTTGCC GATGCTGATG 4500

AAGTTAAAGC AATGATTAAA GAAGTAGTTA GCCAATTTGG TTCTTTAGAT GTTTTAGTAA 4560

ATAATGCAGG TATTACTCGC GATAATTTAT TAATGCGTAT GAAAGAACAA GAGTGGGATG 4620

ATGTTATTGA CACAAACTTA AAAGGTGTAT TTAACTGTAT CCAAAAAGCA ACACCACAAA 4680

TGTTAAGACA ACGTAGTGGT GCTATCATCA ATTTATCAAG TGTTGTTGGA GCAGTAGGTA 4740

ATCCGGGACA AGCAAACTAT GTTGCAACAA AAGCAGGTGT TATTGGTTTA ACTAAATCTG 4800

CGGCGCGTGA ATTAGCATCT CGTGGTATCA CTGTAAATGC AGTTGCACCT GGTTTTATTG 4860

TTTCTGATAT GACAGATGCT TTAAGTGATG AGCTTAAAGA ACAAATGTTG ACTCAAATTC 4920

CGTTAGCACG TTTTGGTCAA GACACAGATA TTGCTAATAC AGTAGCGTTC TTAGCATCAG 4980

ACAAAGCAAA ATATATTACA GGTCAAACAA TCCATGTAAA TGGTGGAATG TACATGTAAT 5040

ATATTTGAGC TAAAGCTCAT TGACGCAGTG GTTGACTGGT CATCCAATGG AGAATTGTCT 5100 

GACCTAGTCA ACTTTGCGGG GGAAATTCTA AGCAACCTAG ATAAGGTTCC AGAATTTCTC 5160 

CCTAAGAAAC ACTAATCAAT aAATTGwTAA GTGTTTCTAA AATTTCTACT TGTTTTTTAG 5220 

AATTTAAAAT GGGAAAATAT AGTAGTCTAT GTATAGGCAT TTTTAAAGGA GGTGAATCGA 5280 

CGTGGAAAAT TTCGATAAAG TAAAAGATAT CATCGTTGAC CgTTTAGGTG TAGACGCTGA 5340 

TAAAGTAACT GAAGATGCAT CTTTCAAAGA TGATTTAGGC GCTGACTCAC TTGATATCGC 5400

TGAATTAGTA ATGGAATTAG AAGACGAGTT TGGTACTGAA ATTCCTGATG AAGAnGCTGA 5460 

AAAAffTCAAC ACTGTTGGTG ATGCTGTTAA ATTTATTAAC AGTCTTGAAA AATAATAAAT 5520 

CTTACATCTG GGTCGTCAGT ATTGTCGACT CAGTTTTTTT CTTTAATTAT CAATAGTTTT 5580

AACGTAAAAT TAAAGATGAT TCAAGAGCAA CACATAAAGG AGATAAAATA ATGTCTAAAC 5640 

AAAAGAAAAG TGAGATAGTT AATCGTTTTA GAAAGCGCTT TGATACTAAA ATGACAGAGT 5700 

TAGGCTTTAC T7ATCAAAAT ATTGATTTAT ACCAACAAGC ATTTTCGCAT TCGAGTTTTA 5760 

TTAATGATTT TAATATGAAT CGTTTAGACC ATAATGAGCG TTTAGAGTTT TTGGGTGATG 5820 

CGGTATTAGA ATTGACGGTT TCACGATATT TATTTGATAa ACATCCCAAC TTGCCAGAAG 5880

GGAATTTAAC AAAAATGCGT GCCaCTATTG TATGTGAGCC CtCACTkGTA ATATTTGCGA 5940 

ATAAAATTGG ATTGAACGAA ATGATTTTAC TTGGTAAAGG TGAAGAGAAA ACAGGGGGAC 6000 
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10 



ATCAAGGACT AGATATAGTT TGGAAATTTG CTGAGAAAGT CATTTTCCCA CATGTAGAAC 6120 

AAAATGAGTT ATTAGGCGTG GTAGATTTTA AAACACAATT CCAAGAATAT GTGCACCAGC 6180 

AAAATAAAGG TGATGTAACC TATAATTTAA TAAAAGAAGA GGGACCGGCA CATCATCGTC 6240

TATTCACTTC A 6251 
(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4920 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

ACCTACTGAA GTTGCTAATT TTTTGGAGCA ACTAAGCACT GAAATTGAAC GTCTTAAAGA 60

AGATAAAAAA CAACTTGAAA AAGTAATCGA AGAGAGaGAT ACTAATATTA AGTCTTATCA 120

AGACGTGgCA TCAATCTGTA AGTGaTGCTT TGATACAAGC TCAAAAAGCT GGTGAAGAAA 180

CTAAGCAAGC TGCAGAGAAA CAAGCTGAAG CGATTATAGC TAAGGCAGAA GCGCAAgcTA 240

ATcAAATGGT TGGTGACGCG GTAGAAAAAG CACGCCGTTT AGCATTCCAG ACTGAAGATA 300

TGAAACGTCA ATCAAAAGTA TTTAGATCGC GTTTCCGTAT GTTAGTTGAA GCGCAATTAG 360

ACTTATTAAA AAACGAAGAT TGGGATTACT TGTTGAATTA TGATTTAGAC GCTGAACAAG 420

TGACGCTTGA AAATATTCAT CATTTGCATG AAAATGATTT AAAGCCAGAT GAAGTTGCAG 480

CAAATGCACA AAATAATGCA TCAAATACAC CAGACAATAA TCAACAATCC AATGATTCAG 540

AAACAACTAA GAAGTAAGAA TTAAATAAAG ACAGACGCGT AATATACATT TAACTTTTCA 600

CAGCGAATTA GGTAATGGTG AGAGCCTAGT AAAAGCATGT ATGTTATATC ACTGGCTTTT 660

TAATATTTAA ATAATGTAAT GAGAGAACTC TAAGTTGAGT TAATAAGGGT GGTACCGCGA 720

GCAATCGTCC CTTTTAATTT AACTTAGAGT TTTTTAAATT TTTAAGGAGT GAAAAAAATG 780

GATTACAAAG AAACGTTATT AATGCCTAAA ACAGATTTCC CAATGCGAGG TGGTTTACCA 840

AACAAGGAAC CGCAAATTCA AGAAAAATGG GATGCAGAAG ATCAATACCA TAAAGCGTTA 900

GAAAAAAATA AAGGTAACGA AACATTCATT TTACATGATG GCCCACCATA CGCGAATGGT 960

AACTTACATA TGGGACATGC CTTGAACAAA ATTTTAAAAG ACTTTATTGT ACGTTATAAA 1020

ACTATGCAAG GGTTCTATGC ACCATACGTA CCAGGTTGGG ATACACATGG TTTACCAATT 1080

GAACAAGCAT TAACGAAAAA AGGTGTTGAC CGAAAGAAAA TGTCAACAGC TGAATTCCGT 1140
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TTAGGTGTTC GTGGTGACTT TAATGATCCA TATATTACAT TAAAACCTGA ATACGAAGCT 1260 

GCACAAATTC GTATTTTTGG AGAAATGGCA GATAAAGGTT TAATTTATAA AGGTAAAAAG 1320 

CCAGTTTATT GGTCTCCTTC AAGTGAGTCT TCATTAGCAG AAGCAGAAAT TGAATATCAC 1380 

GATAAACGTT CAGCATCAAT TTACGTTGCA TTTGACGTTA AAGATGACAA AGGTGTCGTT 1440

GATGCAGATG CTAAATTTAT TATCTGGACA ACAACGCCAT GGACAATTCC ATCAAATGTT 1500 

GCGATTACCG TTCATCCTGA ATTAAAATAT GGTCAATACA ATGTAAATGG cGAAAAATAT 1560 

ATTATTGCAG AAGCCTTGTC TGACGCTGTA GCAGAAGGAC TGGaTTGGGA TAAAGCATCA 1620 

ATCAAATTAG AAAAAGAATA CACAGGTAAA GAATTAGAGT ATGTTGTAGC ACAACATCCA 1680

TTCTTAGACA GAGAATCGTT AGTGATTAAT GGTGATCATG TTACTACAGA TGCTGGTACA 1740 

GgTTGTGTAC ATACAGCACC AGGTCACGGG GAAGATGACT ATATTGTTGG TCAAAAATAT 1800 

GAATTGCCAG TAATTAGTCC AATCGATGAT AAAGGTGTAT TTACTGAAGA AGGCGGCCAA 1860 

TTTGAAGGGA TGTTCTATGA TAAAGCTAAT AAAGCCGTTA CTGATTTATT AACAGAAAAA 1920 

GGTGCACTAT TAAAATTAGA CTTTATTACA CATAGCTATC CACACGACTG GAGAACAAAA 1980 

AAACCTGTAA TCTTCCGTGC TACACCACAA TGGTTTGCCT CAATCAGTAA AGTAAGACAA 2040 

GATATTTTAG ATGCAATCGA AAATACAAAC TTCAAAGTAA ATTGGGGTAA AACACGTATT 2100 

TACAATATGG TTCGTGACCG TGGCGAATGG GTTATTTCTC GTCAACSTGT GTGGGGTGTA 2160 

CCGTTACCAG TATTTTATGC TGAAAATGGC GAAATTATCA TGACGAAAGA AACAGTGAAT 2220 

CATGTTGCTG ATTTATTTGC AGAACACGGT TCAAATATTT GGTTTGAAAG AGAAGCGAAA 2280 

35 GACTTACTAC CAGAAGGATT TACACATCCA GGCAGCCCTA ACGGTACATT TACTAAAGAA 2340 

ACAGACATTA TGGACGTTTG GTTTGATTCT GGTTCATCAC ACCGTGGCGT GTTGGAAACA 2400 

AGAGCGGAAT TAAGTTTCCC AGCGGATATG TATTTAGAAG GTAGTGACCA ATATCGTGGT 2460 

TGGTTCAACT CTTCTATCAC AACTTCAGTT GCTACAAGAG GAGTATCACC TTATAAATTC 2520

TTACTTTCTC ATGGTTTTGT TATGGACGGT GAAGGTAAGA AAATGAGTAA ATCTTTAGGT 2580 

AATGTGATTG TACCTGACCA AGTGGTTAAA CAAAAAGGTG CTGATATTGC GAGACTTTGG 2640

GTAAGTAGTA CGGACTATTT AGCTGATGTT AGAATTTCTG ATGAAATTTT AAAACAAACA 2700 

TCTGATGTTT ATCGTAAAAT CAGAAATACA TTAAGATTTA TGTTAGGTAA CATTAACGAT 2760 

TTCAATCCTG ACACAGATAG CATTCCTGAA TCAGAGTTAT TAGAAGTGGA TCGTTACTTG 2820

CTAAATCGTT TACGTGAATT TACTGCAAGT ACGATTAACA ACTATGAAAA CTTTGACTAC 2880

TTAAATATTT ATCAAGAAGT TCAAAACTTT ATCAATGTTG AGTTAAGTAA TTTCTATTTG 2940
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CAAACAGTGT TATATCAAAT TTTAGTTGAT ATGACGAAGT TGTTAGCACC AATCTTAGTG 3060 

CATACAGCTG AAGAAGTTTG GTCTCATACA CCACATGTTA AAGAAGAAAG TGTTCACTTA 3120 

GCAGACATGC CTAAAGTTGT AGAAGTAGAT CAAGCTTTAT TGGATAAATG GCGTACATTT 3180 

ATGAATTTAC GTGATGATGT GAACCGTGCA TTAGAAACTG CTCGTAATGA AAAAGTTATT 3240 

GGTAAATCAT TAGAAGCTAA AGTTACGATT GCTAGTAACG ATAAATTTAA TGCATCTGAA 3300 

TTCTTAACTT CATTTGATGC ATTACATCAA TTATTTATCG TGTCACAAGT TAAAGTTGTA 3360 

GATAAGTTAG ACGATCAGGC AACAGCTTAT GAACATGGTG ATATTGTCAT CGAACATGCA 3420 

GATGGTGAAA AATGTGAAAG ATGTTGGAAC TATTCAGAGG ATCTTGGTGC TGTTGATGAA 3480

TTGACGCATC TATGTCCACG ATGCCAACAA GTTGTAAAAT CACTTGTATA ATTGAAATTG 3540 

TATAAAGTAC TCATACAGAT GATATAAATT AAAGCTCTCT TCATAATCAT GTTGTAGTTT 3600 

TTGTTGACAT GATGAAGAGA GTTTTTTTGT GAATAAAAAA ATGACGAAGT TACCGGTCAT 3660 

ATATGTAAAA AATGTGCGAT TTACTAAAAT AAAAATTATT CAGGAATGGT ACAAATTCTC 3720 

TGAGGCATAT AAATGCGTTA TAGTTGCTAT TCTCAATTAT GTTCGCGATA ATTTTAAGTA 3780 

AAAGTAAGCA CAGATATTGA ATTTGATAGG AGTTAATTGA ATGTATCATA ACAGTAACGC 3840 

AAACTTTGTC AATGGTATCA CTTTAAATGT GAGAGATAAG AATGAATTAA AGCCATTTTA 3900 

TGAGGACATA TTAGGATTAA ATATTATAAA TGAGACATTA ACATCGATAC AATATGAAGT 3960 

AGGTCAAAAT AATCATGTCA TTACACTTGT TGAATTACAA AATGGACGTG AACCTTTAAT 4020 

GTCCGAAGCG GGACTGTTTC ATATCGCAAT TAAACTACCT CAAATTAGTG ATTTAGCTAA 4080

TTTACTAATT CATTTAAGCG AATATGATAT TCCAGTTAAC GGAGGTATAC AGCCTGCTTC 4140

GTTATCATTA TTTTTTGAAG ACCCGGAAGG AAACGGTTTT AAATTTTATG TTGATAAAGA 4200 

CGAAGCGCAA TGGACGAGGC AAAATAATTT AGTAAAAATT GATATTAGAC CATTAAATGT 4260 

ACCGAGATTA GTGAGTCATG CAACAAAATT GTTATGGTTA GGTATTCCAG ATGACGCTAT 4320

TATAGGTGCA TTGCATATTA AGACAATTCA TTTATCAGAG GTAAAAGAGT ACTACCTCGA 4380

TTATTTTGGA TTAGAGCAAT CGGCATATAT GGATGATTAT TCAATATTTT TAGCATCGAA 4440

TGGCTATTAT CAACATTTGG CCATGAATGA TTGGGTATCA GCAACGAAAC GTGTAGAAAA 4500

TTTTGATACG TATGGATTAG CAATTGTTGA CTTTCATTAT CCTGAAACAA CACATTTAAA 4560

TTTACAAGGT CCGGATGGTA TCTATTATCG CTTTAATCAT ATCGAAGTTG AAGATTAGTA 4620

TATACTTTGA ATGGACGAAC CATATAATGA ATCGTTTTTA ATGATCTTTT TATACAAGTT 4680 

ATGAAGGAGG CTGGGACATT AAGTTCTTAG GCAATGTAAA AAGCTGATTT CTATTAATTA 4740 
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TTTTCCTTAT ATTAATTGCC ATTAATACAA AACCTAGCTC TCGTTTAACT TTATTTATTC 4860

CTCGAACTGA CATTCGnGTG AACTCAAAAT nGCCTACTTn CTTAAATTAC CAATATCTAT 4920 


(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 626 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

TGGATTGCCA TTACATGGAC AAGATTTAAC TGAATCAATT ACACCATATG AAGGTGGTAT 60 

CGCTTTTGCA AGTAAACCAT TAATTGATGC TGATTTTATT GGTAAATCTG TATTAAAAGA 120

TCAAAAAGAA AATGGTGCAC CAAGAAGAAC AGTGGGATTA GAATTACTTG AAAAAGGAAT 180 

TGCAAGAACT GGTTATGAAG TTATGGATTT AGATGGAAAT ATTATTGGAG AAGTAACTTC 240 

AGGAACACAG TCTCCATCAT CAGGAAAATC AATTGCACTT GCAATGATAA AAAGAGATGA 300 


GTTTGAAATG GGTAGAGAGT TGCTTGTTCA AGTTCGTAAG CGTCAATTAA AAGCGAAAAT 360 

TGTTAAGAAA AATCAAATTG ATAAATAATT AAAAAGGGGT GTGCATTGTG AGTCATCGTT 420 

ATATACCTTT AACTGAAAAA GACAAGCAAG AAATGTTACA AACAATTGGT GCAAAATCTA 480

TAGGAGAATT ATTCGGTGAT GTACCAAGTG ACATTTTATT AAATAGAGAT TTAAATATTG 540 

CTGAAGGCGA ACGGAGAACA ACGTTACTTA GAAGATTnAA TCGCATTGCA AGCAAGAGTA 600 

TCACTAGAGG AACGCGTACA TCGTTT 626

(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1126 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 
nGGAAGTGGT GTATATATTT GTAATGAGTG TATTGAATTA TGCTCAGAAA TCGTCGAAGA 60 
AGAATTAGCT CAAAACACTT CTGAAGCGAT GACAGAATTA CCTACTCCTA AAGAAATTAT 120
GGATCATTTA AACGAATATG TTATTGGTCA AGAAAAAGCT AAAAAATCTT TAGCTGTAGC 180 
TGTTTATAAC CACTATAAGC GTATTCAACA ATTAGGACCA AAAGAAGATG ATGTTGAATT 240 

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AACCTTAGCC AAGACGTTGA ATGTACCATT TGCAATTGCA GATGCGACAA GTTTAACTGA 360

AGCTGGTTAT GTAGGCGATG ATGTTGAAAA TATCTTGTTG AGATTAATTC AAGCAGCTGA 420 


CTTTGACATT GATAAAGCCG AAAAAGGTAT TATTTATGTA GATGAAATTG ATAAAATTGC 480 

ACGTAAATCT GAAAACACAT CTATAACACG TGACGTTTCA GGTGAAGGTG TTCAACAAGC 540

ATTGCTTAAA ATCTTAGAAG GTACGACTGC AAGTGTTCCG CCACAAGGTG GACGCAAACA 600

TCCAAACCAA GAAATGATTC AAATTGATAC AACAAATATC TTATTTATTC TTGGTGGTGC 660

CTTTGATGGT ATTGAAGAAG TGATTAAGCG CCGTCTTGGT GAAAAAGTTA TTGGTTTCTC 720 

AAGCAATGAA GCTGATAAAT ATGACGAACA AGCATTATTA GCACAAATTC GCCCAGAAGA 780

TTTGCAAGCC TATGGTTTGA TTCCTGAATT TATCGGACGT GTGCCAATTG TAGCTAATTT 840

AGAAACATTA GATGTAACTG CGTTGAAAAA CATCTTAACG CAACCTAAAA ATGCACTTGT 900 


GAAACAATAT ACTAAAATGC TGGAATTAGA TGATGTGGAT TTAGAGTTCA CTGAAGAAGC 960 

TTTATCAGCA ATTAGTGAAA AAGCAATTGA AAGAAAAACA GGTGCGCGTG GTTTACGTTC 1020 

AATCATAGAA GAATCGTTAA TCGATATTAT GTTTGATGTG CCTTCTAACG AAAATGTAAC 1080 


GAaGGTAGTT ATTACAGCAC AAACmATTAA TGrAGaACTG AACCAG 1126 
(2) INFORMATION FOR SEQ ID NO: 29: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4392 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:
ATTGACTTCT TAGCAATnAA TaTGAGTGAA GAACGTACTG TTGAAGTACC AGTTCAATTA 60 


GTTGGTGAAG CAGTAGGCGC TAAAGAAGGC GGCGTAGTTG AACAACCATT ATTCAACTTA 120 
GAAGTAACTG CTACTCCAGA CAATATTCCA GAAGCAATCG AAGTAGACAT TACTGAATTA 180 
AACATTAACG ACAGCTTAAC TGTTGCTGAT GTTAAAGTAA CTGGCGACTT CAAAATCGAA 240 


AACGATTCAG CTGAATCAGT AGTAACAGTA GTTGCTCCAA CTGAAGAACC AACTGAAGAA 300 
GAAATCGAAG CTATGGAAGG CGAACAACAA ACTGAAGAAC CAGAAGTTGT TGGCGAAAGC 360 
AAAGAAGACG AAGAAAAAAC TGAAGAGTAA TTTTAATCTG TTACATTAAA GTTTTTATAC 420

TTTGTTTAAC AAGCACTGTG CTTATTTTAA TATAAGCATG GTGCTTTTTG TGTTATTATA 480

AAGCTTAATT AAACTTTATT ACTTTGTACT AAAGTTTAAT TAATTTTAGT GAGTAAAAGA 540 
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10 






CTTACTAAGC TAAAGAATAA TGATAATTGA TGGCAATGGC GGAAAATGGA TGTTGTCATT 660 
ATAATAATAA ATGAAACAAT TATGTTGGAG GTAAACACGC ATGAAATGTA TTGTAGGTCT 720 
AGGTAATATA GGTAAACGTT TTGAACTTAC AAGACATAAT ATCGGCTTTG AAGTCGTTGA 780 
TTATATTTTA GAGAAAAATA ATTTTTCATT AGATAAACAA AAGTTTAAAG GTGCATATAC 840

AATTGAACGA ATGAACGGCG ATAAAGTGTT ATTTATCGAA CCAATGACAA TGATGAATTT 900 
GTCAGGTGAA GCaGTTGCAC CGATTATGGA TTATTACAAT GTTAATCCAG AAGATTTAAT 960 
TGTCTTATAT GATGATTTAG ATTTAGAACA AGGACAAGTT CGCTTAAGAC AAAAAGGAAG 1020 
TGCGGGCGGT CACAATGGTA TGAAATCAAT TATTAAAATG CTTGGTACAG ACCAATTTAA 1080

ACGTATTCGT ATTGGTGTGG GAAGACCAAC GAATGGTATG ACGGTACCTG ATTATGTTTT 1140

ACAACGCTTT TCAAATGATG AAATGGTAAC GATGGAAAAA GTTATCGAAC ACGCAGCACG 1200 
CGCAATTGAA AAGTTTGTTG AAACATCACG ATTTGACCAT GTTATGAATG AATTTAATGG 1260 
TGAAGTGAAA TAATGACAAT ATTGACAACG CTTATAAAAG AAGATAATCA TTTTCAAGAC 1320 
CTTAATCAGG TATTTGGACA AGCAAACACA CTAGTAACTG GTCTTTCCCC GTCAGCTAAA 1380 
GTGACGATGA TTGCTGAAAA ATATGCACAA AGTAATCAAC AGTTATTATT AATTACCAAT 1440

AATTTATACC AAGCAGATAA ATTAGAAACA GATTTACTTC AATTTATAGA TGCTGAAGAA 1500 
TTGTATAAGT ATCCTGTGCA AGATATTATG ACCGAAGAGT TTTCAACACA AAGCCCTCAA 1560 
CTGATGAGTG AACGTATTAG AACTTTAACT GCGTTAGCTC AAGGTAAGAA AGGGTTATTT 1620 
ATCGTTCCTT TAAATGGTTT GAAAAAGTGG TTAACTCCTG TTGAAATGTG GCAAAATCAC 1680 

CAAATGACAT TGCGTGTTGG TGAGGATATC GATGTGGACC AATTTCTTAA CAAATTAGTT 1740

AATATGGGGT ACAAACGGGA ATCCGTGGTA TCGCATATTG GTGAATTCTC ATTGCGAGGA 1800 
GGTATTATCG ATATCTTTCC GCTAATTGGG GAACcAATCA GAATTGAGCT ATTTGATACC 1860 


GAAATTGATT CTATTCGGGA TTTTGATGTT GAAACGCAGC GTTCCAAAGA TAATGTTGAA 1920 

GAAGTCGATA TCACAACTGC AAGTGATTAT ATCATTACTG AAGAAGTGAT CAGCCATCTT 1980 

AAAGAAGAGT TAAAAACTGC ATATGAAAAT ACAAGACCCA AAATAGATAA ATCAGTGCGC 2040 

AATGATTTGA AAGAAACGTA TGAAAGCTTT AAATTATTCG AAAGTACATA CTTTGATCAT 2100 

CAAATACTAC GTCGCTTAGT AGCGTTTATG TATGAAACAC CTTCGACAAT TATTGAGTAT 2160 

TTCCAAAAAG ATGCAATCAT TGCAGTTGAT GAATTTAATC GTATTAAAGA AACTGAAGAA 2220

AGTTTAACAG TAGAGTCTGA TTCGTTTATT AGCAATATTA TTGAAAGTGG TAATGGATTT 2280 

ATAGGACAAA GTTTTATAAA ATATGATGAT TTTGAAACAT TGATTGAAGG CTATCCTGTC 2340 
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TCATGTAAAC CTGTCCAACA ATTTTATGGG CAATATGACA TTATGCGTTC TGAATTTCAA 2460

CGATATGTTA ATCAAAACTA TCATATCGTG GTTTTGGTCG AAACCGAAAC TAAAGTTGAA 2520

CGTATGCAAG CGATGTTAAG TGAAAtGCAT ATTCCATCAA TAACAAAATT GCATCGCTCA 2580

ATGTCATCGG GGCAAGCAGT GATTATTGAA GGCAGTTTAT CTGAAGGATT TGAACTACCT 2640

GATATGGGAT TAGTTGTCAT TACTGAGCGT GAgcTTTTTA AATCAAAACA GAAAAAGCAA 2700

CGAAAACGTA CGAAAGCTAT CTCAAATGCT GAAAAAATTA AGTCTTACCA AGATTTAAAT 2760

GTGGGAGATT ATATTGTTCA TGTGCATCAT GGTGTTGGTA GATATTTAGG TGTTGAGACG 2820

CTCGAAGTGG GGCAAACGCA TCGTGATTAT ATTAAATTGC AATATAAAGG TACGGATCAA 2880

CTATTTGTTC CAGTAGATCA AATGGATCAA GTTCAAAAAT ATGTAGCTTC GGAAGATAAG 2940

ACGCCAAAAT TAAATAAACT CGGTGGCAGT GAATGGAAAA AAACAAAAGC TAAAGTTCAA 3000

CAAAGTGTTG AAGATATTGC TGAAGAGTTG ATTGATTTAT ATAAAGAAAG AGAAATGGCA 3060

GAAGGTTATC AATATGGGGA AGACACAGCT GAGCAAACAA CATTTGAATT AGATTTTCCA 3120

TATGAACTTA CGCCTGACCA AGCTAAATCT ATCGATGAAA TTAAAGATGA CATGCAAAAA 3180

TCGCGTCCAA TGGATCGCTT GCTATGTGGT GATGTTGGTT ATGGTAAAAC TGAAGTTGCA 3240

GTGAGAGCAG CATTCAAAGC TGTAATGGAA GGAAAGCAGG TTGCATTTTT AGTTCCTACA 3300

ACTATTTTAG CTCAGCAACA TTATGAGACG TTAATTGAGC GTATGCAAGA TTTTCCTGTT 3360

GAAATTCAAT TAATGAGTCG TTTTAGAACG CCTAAAGAGA TAAAACAAAC TAAGGAAGGA 3420

CTTAAAACTG GATTTGTTGA CATAGTTGTT GGTACACACA AATTACTTAG TAAAGATATA 3480

CAGTATAAAG ATTTAGGGCT GTTGATTGTA GATGAAGAAC AACGATTTGG TGTACGCCAT 3540

AAAGAGCGTA TTAAAACATT AAAACATAAT GTAGATGTAC TAACATTGAC TGCAACCCCA 3600

ATAGCTAGAA CATTGCATAT GAGTATGCTA GGTGTGCGGG ATTTGTCAGT GATTGAAACG 3660

CCGCCAGAAA ATCGTTTCCC AGTTCAAACA TATGTATTAG AACAGAACAT GAGTTTTATC 3720

AAAGAAGCTT TAGAAAGAGA ACTATCCCGT GATGGCCAAG TGTTTTATCT TTATAATAAA 3780

GTGCAATCCA TTTATGaAAA ACGAGAACAA CTCCAGATGT TAATGCCAGA TGCTAACATT 3840

GCAGTTGCTC ATGGACAAAT GACAGAGCGC GATTTAGAAG AAACGATGTT AAGTTTTATC 3900

AATAATgAAT ATGATATTTT AGTAACGACG ACGATTATTG AAACAGGTGT CGATGTCCCA 3960

AATGCAAATA CTTTGATCAT TGAAGATGCA GA7CGCTTTG GATTGAGTCA GTTGTATCAA 4020

TTAAGAGGTC GTGTTGGTCG TTCAAGTCGT ATTGGTTATG CATACTTCTT ACATCCAGCA 4080

AATAAGGTAC TAACTGAGAC TGCAGAAGAT CGATTACAAG CGATTAAAGA ATTTACGGAG 4140
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TTAGGTAAAC AACAGCACGG CTTTATTGAT ACAGTTGGAT TTGATTTGTA CAGTCAAATG 4260 
TTAGAAGAAG CTGTAAATGA AAAACGTGGT ATTAAGGAAC CAGAATCTGA GGTGCCAGAA 4320 


GTCGAAGTTG ATTTAAACTT GGATGCATAT TTGCCAACAG AATATATTGC AAATGAACAA 4380 
GCTAAAATTG AA 4392 
(2) INFORMATION FOR SEQ ID NO: 30: 


(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 729 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

TTTCTTTTGA ATCTATATCG AGGTGGTTGG TAGGTTCATC TAAAATAAGT ACATTGTCAC 60 

GTTGCAACAT AAGTAGTGCT AGTTGTAAAC GTGCTTTTTC ACCACCAGAT AAATCATTAA 120 

TTATCTTTTT AACATCGTCT TGTACAAATA AGAAACGTCC AAGAACTGCT CGAATATCTT 180

TTTCATTCAT TAACGGATAT TGATCCCACA CATAATCTAA AATCGTTTTA CTAGATTTAA 240 

ATTCTGCTTG CTTTTGATCA TAATAACCAA TTTGTAAATT TGCGCCGAAA GTAATATCGC 300 

CATTAAGCGC TTTTTGTTGA TTAGCAATAG TTTTAATTAA GGTCGATTTT CCAATACCAT 360

TTGGCCCAAT GATTGCTATA TGATCGCCTT TAGAGACCTC TATACTCATA GGTTTGGTAA 420 

TTGCAGTTTG ATAACCGATT TCTAAATTTT TTACATGCAT GACGTCATTA CCTGTATTCC 480

GGTCAAAGCC AAATTGAATA TTTGCACTTT TGGCATCTAA CATTGGTTTA TCAATGCGTT 540

CCATTTTTTC TAAAATCTTA CGTCTACTTT TTGCCATTCC ACTTGTTGAA GCACGGGTAA 600

TATTTTTCTC AACAAAAGTT TCTAATCGTT TTATTTCTGC TTGTTGACTT TCATATTCTT 660 

GCATTCGTTT TTGATAATAT AAATCCCGTT GCTGTATAAA TTCCTCGTAA TTACCAACAT 720 

AGCGTTTGA 729 
(2) INFORMATION FOR SEQ ID NO: 31: 


(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13856 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 
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TGATGTTTCG ATACATTTGT TGCACCTTGT GGATATACTT TAAAGGTTGT GTCGTATGTT

TCCTTACTAT CTTTAGCTTC AGATTCCTGT GATTCAACCG TTTTATATTT TTCAAGTGCA

TGTCCTTCAA TATCAACTCG TGGAATAATG CGATTCAACC ATGCTGGTAA ATACCACGAA

CCTTTtCCAA ACAATTTCGt TAATGCAGGA ATTAACATCA TtCTGACTAC GAAGGCATCA

AAGAGTACAC CAAACGCTAA TGCCATACCC ATTGATTTAA TCATGACATC TTCTTGGAAT

ACAAACGCAA AGAAGACACT AAACATAATT AATGCAGCTG CTACAATAAC AGGACCGCTT

TCTTTCAATC CTACTTTGAT AGAATAATCA TTATCCCCTG TTTTACTATm yyCTTCATGr

ATTCGCGACA TAAGGAAGAC TTCATAATCC ATCGCTAATC CAAATAAGAT ACCTATAGTA

ATAACCGGTA AAAATGCTAG CATTGGTCCT GTCGTTTCAA TACCAAACAG ACCTTTCATA

AAACCATCTT GCATTACTAA TGTTGTAAAT CCTAATGTTG CCATTAATGA CAAGACGAAT

CCTAAAACTG CTTTTAATGG TATTAGAATT GAACGGAAGA CAATCATTAA TAAGAAAAAT

GCTAATACAA CAATGACTGA GGCAAATAAA GGTATCGCCT CATTTAACTT TTTAGACATA

TCAATATTAA TGACACTTTG TCCCGAAATC TCCGTTTTGA ACCCATATTT ATCTTGTGCA

TCTTTATGAT AATCTCGTAA ATCATGCACT AAATCATTTG TACTCTCTGC ATTAGGCCCT

TGCTTAGGTA TCACGACCAT CAAAGCGTAA TCATTATCTT TACTCATTTG TGGTGGCGTA

ACGATATCTA CATTTTTCTT ATCTTTAATA TCTTTATATA CAGACTGTAA ATCTTGTTGT

AATCCTTGTG GATCATCCTT TTTATCTTTC ACATTTATCA ACATCGGTAT TTGGCCATTA

AATCCTTCAC CAAATTTATC CGAGATAATA TCGTAAGCTT TTTTCTGTGT AGAATCTGCT

GGTTTAACAC CGTCATCTGG AATACCAAGT CGCATATGAC 7AACTGGTAT TGCAGCTGCT

ACTAATATGA TTAAACCTAG TAATACTGCC GCAAGTGCAT TTCCTGTAAT AAATTTAGAC

CATGGCGTAT CAATATCTTT TTTGAATTTA GACTGTAATT TATTCACTTT AATGCGTTtA

TGGAAAATGC TTATTAATGC AGGTAATAAA GTTAAAGCGC TAAGTACTGC AAAAACAACA

CTAATTGCCG AAGCAAATCC CATTACCGCT AAGAAGTCAA TGCCTACTAA 7GATAAACCA

CATACTGCAA TTACAACTG7 TACACCAGCA AAAACAACTG CACTACCTGC TGTTCCTATT

GCAAGACCAA TGCCTTTAAT GTAATCTGTT TCAGTTTTCA TAAC7TGTCG ATATCTGAAT

AAAATAAATA ATGCATAATC GATACCAACT GCTAGTCCAA TCATTACGGC TAATGTCAGT

GTGACATTTG GTATATCGAA TGCATAAGTT AACAAACTGA TAATACCTAC ACCAGAGGCT

AGACCAATCA ATGCACTTAT AATTGGTAAT CCTGCAGCAA TGACTGAACC GAATGTGATT

AACAGTACAA CAAATGCAAC AATAATACCA ACTAGTTCAG AATTACCGCC TACTTCTGTA
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AAATGACTTT TAACATTATC TCTAGAGCCA TCTTTTAAAG ATGTTTGACT AACGTCATAT 1920

GTGATATCTG CAAATGCAGT TGTTTTATCT TTACTAATTT GCTTATTTTC ATAAGGATCT 1980

GATATTTTAT CAATGTGCTT GTCATCTTTT TTAATATCAT CTAACGTTTT CTTAATATCT 2040

TTAGTAATGT TCGGTTGCAC AATACCATCA TCTTTAGTCG TCTTAAAGAC AACACGTATT 2100

TGTGCCTTTT CACTATCTTG ATTAAAATGT TTTTCAATCT TTTTATTCGT ATCTAACGAC 2160

TCTAATCCTG TCATTTTAAT ATCATTGTCA AATTTCGGTG CATTTGTAGC AAGTGGTATC 2220

AATATTGCAG CTACAATCAC TATCCATGCA ATGACCGCGG ACCATTTATG TTTTGCGATG 2280

AATGTCCCCA TCTTATATAA AAATTTTGCC AAAGTATATT GCCTCCnTT AAAATCAACG 2340

TTATAGTTTA AATATACAGT GTAGATTATT GTTCGATTAT AGTATCTATC CCCGACCTCT 2400

TAAAGAATCA ATTGGAAAAT TTTGTATATT AAACTACACA CAAAGGAGAA ATGTAGATGA 2460

AAGAGACTGA TTTACGAGTT ATAAAGACAA AAAAAGCATT GTCGAGTAGC TTGCTACAAT 2520

TGTTAGAACA GCAATTATTC CAAACGATTA CTGTCAATCA AATTTGCGAC AACGCACTCG 2580

TACACCGTAC AACATTTTAT AAACATTTTT ATGATAAATA TGATCTTCTA GAGTACTTGT 2640

TCAATCAATT GACTAAAGAC TACTTTGCTA GAGATATCAG TGACCGTCTT AATCATCCAT 2700

TCCAAACGAT GAGTGATACG ATTAATAATA AAGAGGATTT GAGAGAAATC GCAGAATTCC 2760

AAGAAGAAGA CGCTGAATTT AATAAAGTAT TAAAAAATGT CTGCATTAAA ATTATGCATA 2820

ACGATATCAA AAATAATAGA GACCGTATCG ATATTGACAG CGACATCCCA GATAATCTCA 2880

TATTTTATAT TTATGACTCG TTGATTGAAG GTTTTATACA TTGGATAAAA GATGAAAAAA 2940

TTGATTGGCC TGGCGAAGAT ATTGATAACA TTTTCCATAG ATTAATCAAT ATTAAGATTA 3000

AATAGTAGAT GAGAAACTCA TGAGCGTTAC CAACATTCAT AATAAAAACG ATAGTGkACA 3060

CGTTAATGAA TTCGTGTACT ACTATCGTTT TTTATTTTTA TCGTGCTTAT CGCTATTAAA 3120

ACAACTGATA CACAACACAT AAACTATGAA GAAAAAAATA AATCCGCTAT CTAAATGACT 3180

TTGACTCAGT TGTTTAAATG ACCAAATTGC TAATACAATT CCCATTATTA TTGAAATAAC 3240

GTATCTCACA TTCTTATACC TATAATCCTT TTCTAAAAAT ATGGTTGCTA TTACTTAATT 3300

TTTAAAGTTA TAAATAAAAA GAGCCAACCG CAATGGATGG CCCTTGTTCA TTATGAAGCA 3360

TTAGAACATT TCTGAAACAA CCTTTTGTTC TAAGAAGTGT AATAAGTAGT CTGGACTACC 3420

TGTTTTAGCG TCCGTACCTG ACATTTiGAA ACCACCAAAT GGATGGTATC CAACAACTGC 3480

TGAAGTACAG CCTCTGTTAA GGTATAAATT GCCTACATCA AATTCGTTTA CCGCTTTAAT 3540

CCAATGCTCG CGATTATTTG TAATCACTGC ACCAGTTAAA CCGTAATCTG TATCATTTGC 3600
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TTCTTCTTGC ATGATTCTAT CTTTAGATTT AAGTCCTGAA ATGATTGTTG GTTCTACAAA 3720 

GTAACCTTTT GAATCATCAG TGCCGCCACC TTGTTCTAAT TTACCTTCTT CTTTACCAAT 3780

CTCAATATAA TTTTTAATCT TATCAAATTG TTTTTTATTA ATAACTGGGC CCATATACGT 3840

ATTGTCTACA GTATTGCCCA ACGTTAATTC TTTTGTTAAT TTGATTGATT TCTCTAATAC 3900 

TTCGTCATAA ACGTCTTTAT GCACAATTGC ACGTGAACAT GCTGAACATT TTTGACCAGA 3960 

AAAACCAAAT GCTGACGTTA CAATAGCTTC TGCTGCCATA TCTGTATCAA TATTTTCATC 4020 

AACTACAATG GCATCTTTAC CACCCATTTC AGCGATAACA CGTTTCAAGA AGTTTTGACC 4080 

TTCTTGAACA ACGGCACTAC GTTCATAAAT TCTAGTACCT GTCGCACGTG ATCCTGTAAA 4140

TGTAACGAAA TGCGTATCTT TATGATCAAC TAAGTAATCA CCAATTTCTT TCGGATCACC 4200 

AGGAACAAAG TTAACTACGC CTTTTGGTAA TCCTGCTTCT TCTAAAATTT CCATTAATTT 4260 

ATAAGCGATA TAAGGTGTAT CCTCAGCAGG TTTCAATAAC ACTGTATTAC CTGCCACAAC 4320

TGGTGCTAAA GTTGTACCAG CCATAATCGC AAACGGGAAG TTCCACGGCG GAATTGTAAC 4380

ACCTGTACCA ATTGATTTAT AGAAATATTT ATTGTGTTCA CCTTCACGAT CAAGTACTGG 4440 

CTTACCTTGA GCCAAGTCCA TCATTGAACG TGCATAGTAT TCAATAAAAT CAATACCTTC 4500

AGCTGCATCA CCAACTGCTT CATCCCATGG CTTACCTGCT TCATAAACCA TAATTGCTGC 4560 

AATTTCCGCT TTTCGACGAC GAATAATTGC CGAAACACGT AACATAAGCT CTGCACGATC 4620

ATTTGCTGAC CATGTTTTCC AAGATTTATA AGCTTCGTTT GCTGCTTTAA ACGCATCTTC 4680 

AACATCTTGT TTTGTTGCCT TTGATGCATT TGCAATCACT TGTGATGTGT CTGCAGGATT 4740 

GATTGATTTA ATTTTGTCAT CTTTGAAAAT CTTCTCTCCA TTAATCACTA ATGGTATGTC 4800

TTGACCTAAT TCTTTTTCCA CGTCTTTCAA TGCTTTCTTA AACATATCCA CATTTTCTTG 4860 

GACTGAAAAA TCGTAACCAG GTTCATTTTT AAATTCTACT ACCATGTACA CTTACCCCCT 4920 

ATAAATTTTG AAAGTGGTTT AACCCTTTGA TTTAATGATA TAACATCATT TAAACTGVTT 4980 

TTACTATGAT TAAGGTTAGT TTTGCAATCG CTTTCATTTT TATGTTTTAT CACTTATTCT 5040

CAAGTATTTT GAAATTGATT GGTTACTTTT TAAAATTTAT ATGGGTCGCA ACTGCTACTT 5100 

TATCGTTTCG TCATTTAATG TTTCGGATGG TAGGTCATTA TCAATTTTAC GAACGACTTT 5160 

ACAAGGGTTT CCAACCGCTA AGCTGTGTGG CGGAATATCT TTAGTGACAA CACTACCAGC 5220 

ACCAATCACA CTGCCTTCTC CAATCGTCAC CCCTGGTAAC ACGGCTACAT GACCGCCAAA 5280

CCAAGTATTA CTGCCAATAT GAATGGGTCC GGCTTTTTCA AAACCTTCAT TTCTATGATG 5340 

GAAATTAAGT GGATGTGTCG CTGTGTAGAA TCCACAATTA GGTCCTATAA AAACATTATC 5400 

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TCCTAGTTTA ACGTTCCAAC CATAATCTGT ATCAAAAGGA ATCGAAATAC TTACATTGTC 5520 

TGTTGTTGTT TGAAATAATT GATCAATTAA TTCCTTTCTT TTATTTGTAG CACTCGGTCT 5580 

TGTATGATTT AATTCAAAGC AAATATCTTT CGCTCGTGCA CGTTCATTGA TTAAGTATTG 5640 

ATCAAAGTTT GCATCGTACC ATTTTTCTGC TAACATTTTT TCTTTTTCAG TCATTACACC 5700 

TTTCAACTCC TAATAACTTA TTTACTTGTT TAAAAGTTAA TCAAATAAAC CTTCGCCTAT 5760 

GCAACTAATA CGCTATAACA TTATGAAATC ATGACCTTAT CACCCTTATC TATACAATTC 5820 

TCGCATCAAA TACTGCTAAA GTAGTAGATA AATTCAATAC TACAGACGCA TTCATTTTTT 5880 

AATCTATTAA CGTACAATGT GAGTAAGAGA AATATAAAGG AGTATGATAG CGATGAGAAT 5940

ATTAATTACA GGCACAGTTG CTATCTTAAT CATTCTAGGT TTGGTCAAAA CGATACAAGA 6000 

TTACGAAATG ACAAACGACA CGAGTCGTcA GTTGTCAGAC AACAAAGATG ATGATAAAGT 6060 

CATCCATCTT AATAATTTTA AAAATTTACA TGCGAAAGAA TTTAACCCAT CTGATTTCTT 6120 

TTAAGTCACC TAAGAATTGC AAATCCAGAA GTCATTTAAG TTTTACCTTT CATTCATACA 6180 

TCCTTTAATA TTAATTACGA CTTCTTTTAT ATAGATGCTA AGTAGAGAGA TTGTTGTGCA 6240 

ATGTTTGCAC GGCAATCTCT CTTTTTCTTT TTAAAATTGG TAAAAGTAAA ACGCAACGAT 6300 

TGACTTATAT ACCTATAGGG GGTACATTAG ACGTGTAACA ATGAATCACA GGGAGGCAAT 6360 

AATGTGGCTA ATACGAAAAA AACAACATTA GATATCACTG GTATGACTTG TGCCGCATGT 6420

TCAAATCGTA TCGAAAAGAA ACTGAATAAA CTTGATGACG TTAATGCCCA AGTGAATTTA 6480 

ACTACAGAGA AAGCAACTGT TGAGTATAAC CCTGATCAAC ATGATGTCCA AGAATTTATT 6540 

AATACGATTC AACATTTAGG TTACGGTGTC GCTGTAGAAA CTGTCGAATT AGACATTACA 6600

GGTATGACTT GTGCTGCATG CTCAAGCCGT ATTGAAAAAG TGTTAAATAA AATGGACGGC 6660 

GTTCAAAATG CAACGGTCAA TTTAACAACA GAGCAAGCTA AAGTTGACTA TTATCCTGAA 6720 

GAAACAGATG CTGATAAACT TGTCACTCGC ATTCAAAAAT TAGGTTATGA CGCGTCTATT 6780 

AAAGATAACA ATAAAGATCA AACGTCACGC AAAGCTGAAG CGCTACAACA TAAATTGATT 6840

AAGCTTATCA TATCAGCAGT ATTATCTTTA CCACTATTAA TGTTAATGTT TGTACATCTT 6900 

TTCAATATGC ATATACCAGC ACTATTTACG AATCCATGGT TCCAATTTAT TTTAGCTACA 6960 

CCTGTACAAT TTATTATTGG ATGGCAATTT TATGTAGGTG CTTATAAAAA CTTAAGAAAT 7020 

GGTGGCGCCA ATATGGATGT ACTTGTTGCT GTTGGTACAA GTGCAGCATA TTTTTACAGT 7080

ATTTATGAAA TGGTTCGTTG GCTAAATGGC TCAACAACGC AACCGCATTT ATACTTTGAA 7140

ACAAGCGCCG TACTAATTAC CTTAATCTTA TTCGGTAAGT ATTTAGAAGC TAGAGCGAAG 7200 
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GAAATTTTAA ATGTTGAAGG TATGAGCTGT GGTCACTGCA AAAGTGCTGT TGAATCTGCA 
TTAAATAATA TTGACGGTGT CACTTCAGCT GACGTTAACC TTGAAAATGG TCAAGTAAGT 
GTTCAATATG ATGACAGTAA AGTTGCTGTA TCTCAAATGA AAGACGCAAT TGAAGATCAA 
GGTTACGATG TCGTTTAATT AGGCAATATT CAACGTCATC AACACCAAAT TAAAAAATCG 
AACTGATGAG AATCCCAACA ATCCAAATTA TCTCATCAGT TCGATTTTTA ATTTACTCGT 
AACCTAGTAT CTCCAGTCTG CAATACATCT AATGTTGCAT CTAATGCATC GACAATTAGA 
TTTTTAACTG CAGCTTCAGT ATAAAACGCA ATATGTGGTG TTAATATGAC ATCTTCCCTG 
TCAATCAACG ATTCTAACAA TGGATCGTTC AGTGTTTTGC CCCTTTGATC ACTTGGGAAA 
AGTTTGCGTT CAAATTCATA CGTATCAAGT GCTGCACCTT TAATCACACC ATTGTCTAAT 
GCGTCTAATA ACGCCTTAGT ATCTACTAAA GAACCTCTCG CACAATTGAC AAATACTGCG 
CCCTTTTTAA AATGTTTAAA TAATTCAGCA TTAAATAGAT AATGATTATA TTTCGTTGCA 
GGTACATGTA ATGTCACGAT ATCAGCACCT TCAACCGCTT CCTCAATCGT ATCTTTGTAA 
TCGACATACG TTGCAATTTT AGCATTAGGA AACGGtCGTA TGCGACCACA TCACTTTGAT 
AACCATTGGC AAATATATCG GCTACTACAC GGCCAATTCG ACCTGTACCA ATAACAGCTA 
CTTTTAAATC TTTAATGGAT TTCGATAAAA TAGTAGGTTC CCATCTAAAA TCATGcTCCC 
GCACTTTCGT TTGAATTTGA TTAAAATGAC GAACCACATT AATAGCCTGG TTCACAGCAA 
ACTCCGCAAT TGAATTCGGA GAGTATGACG GCACATTTGA CACAATAAAG TTATACTTGT 
TTGCTAACTC CAAATCATAT GTATCAAATC CAGCACTACG TTGTGCGATT TGTTTAATAC 
CTAGTTCATT TAATCGTTTA TAAACATGCT CTGATAATGG TATTTGTTGT GATAGCGATA 
AGCCATCATA ACCAGCGACA CCTTCAACAT TGTCATCAGT TAATGCTTCT TTAGTAATAT 
CTACOTCAAC ATGATGTTTC TCTGCCCACG CCTTGATATA AGGCATATCT TCATCACGTA 

CACTCATGAT TTTAATTTTT GTCATTTTAA CATCACCCTT AACTTTATTA TTCATATAAA 
TATGCTAGTT CTGTTAATCT TATTGCAGCT TCGTCTAATT TCTGGTCATC TAACGCCAAT 
GAAATTCTCA CATAACGATT ACCATTCTCT CCAAATGGTT TCCCTGGAGC AACAAGTATT 
GACTTCTCTT GCACTAAAAA TTGCTCAAAT TGCTCGCTGT CATAACCAGG CGGTGTTTCC 
AACCATACAT ATATGCCACC TTTAGCATGA ACAAATGGCA AATCAGCTTT TGCAAGCATG 
GCTTCGAATC GGTCACGACG TGTTTTAAAT ACATTGCTTT GTTCTTCTAA AAAATCATCA
TAATGATTCA AAGCATATAT TGCGGCATCT TGTAATGCAC CAAACATCCC AGCATTTGTG 
TGCGTTTGGT ACTTTTTCAA AGCTTGAATC ATATCTTTAT TACCAACTGC AAAACCGACT 
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CCATTTTCCG AAGCAAGTAT ACTAGGATTT TTAGCGTCGA AACCGAAAGC ACCATAAGCA 10920
AAATCATGCA CGATTTTAGT GTCTGTACCT TTAAATTTAG CTATCGCTTC ATCAAAAACT 10980
TCTTTCGTAG CTGTCGATCC AGTTGGATTA TTTGGATACG TTAAATAAAT GAGTTTTGTT 11040
TTATCTATTA TTTGTGAATC AACTTTGGAC CAATCTGGCA AATAATGTGG CGGTTCTAAA 11100
TTAAGCGGGA CTGGCTTGCC ATCAGCTAAA AGTACACCTG CTAAATAATC CGTGTAGCCT 11160
GGATCAGGTA GTAATACATA GTCTCCTGGA TTGATAACAC ATGTTGGTAC TGCCACTAAT 11220
CCATTTTTTG TACCA7ATAA AATGCATACT TCATCTTCTT TATCTAACGT CACATTATAT 11280
TGTCTTTGAT AAAAATCTAC AATAGCTTGC TTGAACGCTT CTTTACCATG AAAAGCACCA 11340
TATTTTTGAT TTTCAGGAAT AGTTAGTGCT TTTTGAAAAT GATCAATAAT ACCTTGTGGC 11400
GTGGGCCCAT CAGGGATTCC AACTGCCATA TTAATTAATG GCAATGGTCC ATGTTCGATT 11460
TTACGTCCCA TCGTTTTCCC GAAATAACTA TCAGGGATAT TTGCTAATTT GTTAGAGATC 11520
ATCAAATTCC TCCTCTATCA TTAAACATAG CCTGGGCGAC TATCATAATC CTAACAACTT 11580
GTATCACTCT CATTTAGATG GTTACAATGA CATCGCCATT CACCGTTATG TTCAACAGAA 11640
CTTATGACAC ACGTTGTATT GAATGAATTT ATTTTCATTT TAGGTAGGTA TAATATTATT 11700
GTCAATATTA GGAATTTTCA GATTAATATG CACTCAATCG TTATGATTTA ACTGTCATGC 11760
ATATCCGCAT GCGCAACCAG TTAGATATGC TTATATAAAG TATAACGCCC ATCAAGGTAC 11820
GTATTCAAAC GTGAACCTTA ACAGGCGTCA TTCATTGTTA AATAAAACTT CTTAAGCACA 11880
TACTTATTTC ACTATGCCTT TTACGTTCCC CTTATACTTT TCTCACATCT TTCTCTTAGA 11940
CTACTCCCTT ATACGCCCCG CTCAATATCT TTAATCATTT CATCTACAGT TATTTTCGCA 12000
CTCGTTAAGA CAATAGGAAC GCCTGCACCT GGATGCGTAC TTGCACCTGC AAAATATAAA 12060
TCTTTATAAT CTCGCGATAC ATTTTGTGGA CGATAATAAT TACTTTGCGC TAAAGTTGGC 12120
ATTAAACCGA ATGCCGAACC AAATTTCGCA TGATACGTTT GCTCAAAATC ATTTGGCGTA 12180
AAGATTGTTT CTGAAACAAT ATGCGATTTT ATATCTTCAA ATACTTCAAT CGTTGCTAAT 12240
TTACGATAAA TAATTTCCTT TATTTGTTGC GTCAAAGCTT CATCTGACCA ATCGATTCCG 12300
CTACCTGTTT TAAGTTCCGG CGTCGGCATT AGCACATAAA TACCAGTTTT GCCTTCTGGC 12360
GCAAGTGATT TATCAGCGAC CGCTGGTACA TACACATAAA TAGAAGGATC ATATGATAAA 12420
CGTCCCTCAA ATATTTCTTC AATATTGCCT CTAAAGTCAT CTGAAAAAAT AACATTATGA 12480
AGTCTCACTT GATCTGTCAC ATCAATATCT ATACCGATAT ACATTAAAAA TGCTGAACAA 12540
GAGTAATCTA AGTCTGCAAT TTTATGTGGT GGATACTTTT TAATAGGTGC AAAATCTGGC 12600
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ATGTCACCAT TCACTTTTAT CGCATCGGCC CGTTTGAATT TAGGATCAAT AATAATTTGC 12720
TCAATTTCAG CATTTAGTTC AATATTAACG CCTAAGTCTT TATTTAATTG CGCTAGcCCT 12780 
TGAGCCATGC CATACATACC GCCTTTAATA AAATGCACAC CAAACATCAT TTCAATCATA 12840
GGAATAATTG AATATAGTGA CGGGCCTCGT TTTGGATCAA TTCCTATGTA TAACGTTTGA 12900
AACGCTAAAA GCTTTTGTAT CTTTTCGTTA TCAATATAAT GTTCAATTAG CTGATCTGCA 12960 

TGATTTAACG TTTTTAACTT AGCACCTTGC ACAAGTGACG TCATATTATA AAAGTCACTC 13020 
GGTTTGCGAT ACGTTCTTTC TAAGAAATAG CGACGTGCAA TTTCATATTT TTTATAAACA 13080 

TCCGTTAAAA AGGACATAAA ACCATGCGTT GAACCAGGTT CTATACTTTC TAGCATTTGC 13140
TGTAATTCAG CTAAATCTGT AGGCACCGTT ATACGATCAT CGTGGTCAAA ATACACATCG 13200 
TAAATATAAC GTAATTGTCT CAATTCAATA TAATCTTCAT AATTTTTACC ACACGCTGTA 13260 

AAAACATCTT TATAAACATC TGGCATCATG ACAATTGTGG GACCCATATC AAATGTAAAG 13320
CCGTCTTTCT TTAATTGATT CATACGCCCG CCTACATTAT TATTTTTTTC AAATATCGTC 13380 
ACTTCATGAC CTTGAGAAGC AATACGGGCT GCCGCTGCTA ATCCTGTGAC ACCTGCACCA 13440 

ATTACTGCAA TCTTCATTAT TCAACCACCT ATATTCTATG ATATTTACTA TTTA7TTCAT 13500 
GAAACAACTT TGCCTTTTTC CTCTTATCCA CAAAAACACG TTCATGTAAT GTATAGTTAG 13560 
CCTGTCTCAC TTCGTCCAGT ATTTCAATAT ATATACGTGC 7GCTAATTCT ATGATTGGTT 13620 

GTGCTTCAAT ACTAAATACT TTGATTTGAT CCATAACATC TTGAAAATCT TTTTCTGCGA 13680 
TAGCTGCATA ATATTCCCAT AAGTCAATAT AATGATTATT AACACCATTT TGGTACACTT 13740 
CAGCAATATC AACTTCATAT TGCTTTAATC GTTGCTTACT AAAATATATC CGTTCATTGT 13800
CAAAATCTTC ACCGACATCT CTTAATATAT TAAnGGGATC CTCTAGAGTC GACCTG 13856

(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10088 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 
ATATATAAAT ATAGATTAAG TATATAGATT AATCAACTTT TTTGGAAGAG CAAATCACGC 60
AATCAACAAA TAATATAAGA AGTTTTTGCG ATAGTTTTAA AATAGCTGTA ATAGAATACT 120 
AAATGTGACA AACTTAGAAC TAATATCAAG TGTTGATGTT TTGAATATAA AAATGCTAAT 180 
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ATAATTGGTT AATATATGAG TAATTAGAAA A7AGACAAAG GATGACGATT TATGTATATC 300 

AATATGAAAG ATTATGGGTT AACAGGCATA AACAAAACTA AAGATACTCG AGCAATACAA 360 

CGTGCGTTAA ATCGTGGAAG ATGTAAACCA ACGACAGTTT ATATACCGAA AGGGACGTAT 420 

GATATTTGCA AACCATTAAC GATATATGGC AATACAACAC TTTTGTTAGA TAATGAAACT 480

ATTTTACGCC GATGTCATTC TGGTCCTTTA TTAAAAAATG GTCGTCGCTT TGGTTTTTaT 540 

CGTGGTTATA ATGGACACAG TCATATTCAT ATTAAAGGCG GCAAGTTTGA TATGAATGGT 600

GTATCGTATC CTTATAACAA TACAGCTATG TGCATTGGGC ATGCTGAAGA TATTCAATTA 660 

ATAGGTGTGA CCATTAAGAA TGTAGTGAGT GGTCATGCAA TTGATGCTTG TGGGATTAAC 720

GGACTCTATA TTAAAAGCTG TTCATTTGAA GGATTCATAG ACTATAGTGG CGAACcTTTT 780

ATTCTGAAGC AATACAATTA GACATTCAAG TACCTGGTGC TTTTCCAAAA TTCGGAACgA 840 

CAGATGGTAC GATAACGAAA AATGTCATTA TCGAAGATTG TTATTTTGGA CCTTCAGAAT 900

TGCCCGAAAT GGGAAGTTGG AATCGTGCTA TTGGCTCACA TGCAAGTAGA CATAATCGAT 960 

ACTATGAGAA TATTCATATT AGAAATAATA TATTTGAAGA TATACAAGGT TATGCATTAA 1020 

CTCCCTTGaA GTATAAAGAT GCTTTCATTA TTAATAATAA GTTTATTAAC TGTGaGGGTG 1080 

GCATTAGATA TTTAGGAGTT AGAGATGGTA AAAATGCAGC AGATGTGaTG ACAGGaAAAG 1140 

ACTTAGGTTC CCAAGCAGGC ATAAATATGA ATATAATTGG AAATGAATTT AAAGGATCAA 1200 

TGTCTAAAGA TGCGATACAT GTACGTAATT ATAATAATGT TAAACATAAA GATGTATTAA 1260 

TCGTTGGGAA TACATTCAAT AATTCGACTC AATCAATTCA TTTAGAAGAT ATTGATACAG 1320 

TGTTTTTAAG TCCTGTTGAA GCGGGTATTC AAGTTACTAC AATCAATGTA GATGAAATAA 1380

AAAAGTAAAA AGTTTCGCAT GACATTAGGA TTAAGAATAG TAGATAATTT TTGAAAGCGC 1440 

ATTGATAAAA CGGTATAAAT ATGCTATAAT AAACCCAATT ATCTGATAAA AGGGGTATTT 1500 

TGACGGTAAT GATAATACAA GATAGACAAC TTTCTATACT CTAATATAGT GAGTTGAAGT 1560

AGCTTGTCAT AATCATCATG AGGGGGAAAT TTATGGCTTA TTTCAATCAA CATCAATCAA 1620 

TGATATCGAA AAGGTATTTA ACATTCTTTT CAAAATCAAA GAAAAAGAAA CCGTTTAGTG 1680 

CGGGACAACT TATTGGACTA ATATTAGGTC CATTACTTTT CCTATTAACA TTATTATTCT 1740 

7TCATCCACA AGACTTACCT TGGAAAGGCG TCTATGTTTT AGCGATTACT TTATGGATTG 1800 

CGACTTGGTG GATTACTGAA GCAATTCCTA TTGCAGCAAC GAGCTTATTA CCAATTGTGT 1860 

TATTACCATT AGGTCATATA CTTACACCAG AACAAGTATC ATCCGAATAT GGCAATGATA 1920 

TTATCTTTTT GTTTTTAGGT GGATTTATTT TGGCAATTGC AATGGAAAGA TGGAATTTAC 1980
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TTGGATTCAX GGTGGCAACA GGATTCTTAT CTATGTTTGT ATCGAACACX GCAGCTGTAA 2100

TGATTATGAT TCCGATTGGT TTAGCAATTA TTAAGGAAGC ACATGATTTA CAAGAAGCCA 2160

ATACGAATCA AACAAGTATT CAAAAGTTTG AAAAATCTCT AGTTTTAGCA ATTGGCTATG 2220

CAGGTACGAT TGGTGGCTTG GGTACATTAA TCGGAACCCC GCCATTAATT ATTTTAAAAG 2280

GACAATACAT GCAACATTTT GGACATGAAA TTAGTTTTGC TAAATGGATG ATTGTAGGGA 2340

TTCCAACGGT CATTGTTTTG TTAGGTATTA CTTGGCTCTA TTTAAGATAT GTTGCGTTTA 2400

GACATGATTT GAAATATTTa CCTGGTGGTC AGACGTTAAT TAAACAAAAG TTAGACGAGC 2460

TTGGCAAAAT GAAGTATGAA GAAAAGGTAG TACAAACTAT CTTTGTACTT GCTAGCTTAT 2520

TATGGATTAC AAGAGAGTTT CTTCTGAAAA AATGGGAAGT TACGTCATCT GTTGCAGATG 2580

GTACGATTGC TATTTTTATA TCAATATTAT TAIITATTAT TCCAGCTAAA AATACTGAAA 2640

AACATCGCCG TATCATTGAC TGGGAAGTTG CAAAAGAGCT CCCTTGGGGT GTATTAATTT 2700

TATTTGGTGG CGGTTTAGCA TTAGCGAAAG GTATTTCTGA AAGTGGTTTA GCAAAATGGT 2760

TAGGCGAACA GTTGAAATCA TTAAATGGTG TTAGTCCGAT TCTTATTGTA ATTGTCATAA 2820

CAATCTTTGT CTTATTTTTA ACTGAAGTGA CATCTAATAC TGCAACTGCA ACGATGATTT 2880

TACCGATTTT AGCAACGTTG TCTGTTGCTG TTGGAGTGCA TCCATTACTA CTTATGGCAC 2940

CTGCAGCTAT GGCGGCTAAC TGTGCATACA TGTTACCAGT AGGGACACCA CCGAATGCAA 3000

TTATCTTTGG TTCTGGTAAA ATATCTATCA AACAAATGGC ATCAGTAGGA TTCTGGGTAA 3060

ACTTAATCAG TGCAATAATT ATTATTTTAG TCGTGTATTA TGTAATGCCT ATAGTTTTAG 3120

GTATTGATAT AAATCAACCA CTGCCATTGA AATAGTAATT GCAGATTAGA ACGAAAAATA 3180

AAAGGTTACA TTAGCAATTG CTTGGACGAG TGGTAACGAA ACGTATACCG CAGCATCGTG 3240

TAASAACAAT ACAAACAAAA GAAAGTCAAC CAAGGATGGA TTCCTATTTT AATCCTTGGT 3300

TGACTCTTTA TTTTATTTAA ATTGTAGAAC CTAGAAAATA AAGTTTAATT AAAAGCACCA 3360

ATCATTTCTA CTTTGAAATC TAAGGTTTCT AAAATAGCAA TGACTTTCTT TATATCGGTT 3420

GTAATTGCAG AATCAGCCTG AACGAAAAAT CGATACATAC CTAATTGTGT TriTAAAGGA 3480

CGAGACTCAA TCCAGGATAA ATTAATATTA AACAAAGCAA ATGTATTAAG CACACTTGCT 3540

AACAACCCAG GTTTATCATG CATTGGTGTA ATTAAAAACA TCAATGATGT CGCATTTTGA 3600

TCAAATTGCT GCTGATTTTT TATAACTAAA AAACGTGTCA CGT7ATGTGG ATAGTCTTCA 3660

ATATGTGTAT CAATAGGTGT AAAACCATAA GctTCGCCAC TACCTAAAGG TGCAATTGCT 3720

GCAACGCCAT TTTCAATTTT AGTCAAACTT TGAATTGTAC TGTCGACATA ATCATAGTCA 3780

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TTAAAGTCTA TTGGCTACAA GGATGATTTC TTATCATATT TAAAAGATTT AAAATTCACA 5700 

GGCAGCATCC GTTCGATGCA AGAAGGCGAA TTATGCTTTG GTAACGAACC ATTGTTACGC 5760 

GTAGAAGCAC CATTGATTCA AGCGCAATTA ATAGAAACAA TTTTATTAAA CATTGTAAAT 5820 

TTCCATACAT TAATTACAAC AAAGGCTAGC AGAATTCGTC AAATTGCATC AAATGATAAA 5880 

TTAATGGAGT TTGGTACACG TCGTGCGCAA GAAATTGATG CAGCATTGTG GGGCGCTAGA 5940 

GCTGCTTACA TCGGGGGCTT TGATTCTACA AGTAATGTTA GGGCGGGGAA ATTATTTGGT 6000 

ATACCTGTGT CTGGTACACA TGCACATGCA TTTGTCCAAA CTTATGGAGA CGAATATGTT 6060 

GCCTTCAAAA AATATGCTGA AAGACATAAA AATTGTGTGT TCCTAGTAGA TACATTCCAT 6120

ACTTTAAAAT CTGGCGTGCC AAATGCAATA AAAGTTGCAA AAGAATTAGG TGACAAAATT 6180 

AACTTTGTAG GTATTCGATT AGATTCTGGA GATATCGCTT ATTTATCTAA AGAGGCAAGA 6240 

CGTATGCTTG ATGAAGCAGG ATTTACTGAA ACTAAAATTA TCGCGTCTAA TGATTTGGAT 6300 

GAAGAAACGA TTACGAGTTT GAAAGCACAA GGTGCAAAAG TAGATTCTTG GGGCGTTGGT 6360 

ACAAAGCTGA TTACAGGATA CGATCAACCA GCATTAGGTG CAGTATATAA ACTTGTAGCT 6420 

ATTGAAAATG AAGATGGTTC ATATAGTGAT CGTATTAAAT TATCAAATAA CGCTGAAAAG 6480 

GTTACGACGC CAGGTAAGAA AAATGTATAT CGCATTATAA ACAAGAAAAC AGGTAAGGCA 6540 

GAAGGCGATT ATATTACTTT GGAAAATGAA AATCCATACG ATGAACAACC TTTAAAATTA 6600

TTCCATCCAG TGCATACTTA TAAAATGAAA TTTATAAAAT CTTTCGAAGC CATTGATTTG 6660 

CATCATAATA TTTATGAAAA TGGTAAATTA GTATATCAAA TGCCAACAGA AGATGAATCA 6720 

CGTGAATATT TAGCACTAGG ATTACAATCT ATTTGGGATG AAAATAAGCG TTTCCTGAAT 6780

CCACAAGAAT ATCCAGTCGA TTTAAGCAAG GCATGTTGGG ATAATAAACA TAAACGTATT 6840 

TTTGAAGTTG CGGAACACGT TAAGGAGATG GAAGAAGATA ATGAGTAAAT TACAAGACGT 6900 

TATTGTACAA GAAATGAAAG TGAAAAAGCG TATCGATAGT GCTGAAGAAA TTATGGAATT 6960 

AAAGCAATTT ATAAAAAATT ATGTACAATC ACATTCATTT ATAAAATCTT TAGTGTTAGG 7020 

TATTTCAGGA GGACAGGATT CTACATTAGT TGGAAAACTA GTACAAATGT CTGTTAACGA 7080 

ATTACGTGAA GAAGGCATTG ATTGTACGTT TATTGCAGTT AAATTACCTT ATGGAGTTCA 7140 

AAAAGATGCT GATGAAGTTG AGCAAGCTTT GCGATTCATT GAACCAGATG AAATAGTAAC 7200 

AGTCAATATT AAGCCTGCAG TTGATCAAAG TGTGCAATCA TTAAAAGAAG CCGGTATTGT 7260

TCTTACAGAT TTCCAAAAAG GAAATGAAAA AGCGCGTGAA CGTATGAAAG TACAATTTTC 7320 

AATTGCTTCA AACCGACAAG GTATTGTAGT AGGAACAGAT CATTCAGCTG AAAATATAAC 7380 
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TAAACGACAA GGTCGTCAAT TATTAGCGTA TCTTGGTGCG CCAAAGGAAT TATATGAAAA 7500 

AACGCCAACT GCTGATTTAG AAGATGATAA ACCACAGCTT CCAGATGAAG ATGCATTAGG 7560 

TGTAACTTAT GAGGCGATTG ATAATTATTT AGAAGGTAAG CCAGTTACGC CAGAAGAACA 7620

AAAAGTAATT GAAAATCATT ATATACGAAA TGCACACAAA CGTGAACTTG CATATACAAG 7680 

ATACACGTGG CCAAAATCCT AATTTAATTT TTTCTTCTAA CGTGTGACTT AAATTAAATA 7740 

TGAGTTAGAA TTAATAACAT TAAACCACAT TCAGCTAGAC TACTTCAGTG TATAAATTGA 7800 

AAGTGTATGA ACTAAAGTAA GTATGTTCAT TTGAGAATAA ATTTTTATTT ATGACAAATT 7860 

CGCTATTTAT TTATGAGAGT TTTCGTACTA TATTATATTA ATATGCATTC ATTAAGGTTA 7920

GGTTGAAGCA GTTTGGTATT TAAAGTGTAA TTGAAAGAGA GTGGGGCGCC TTATGTCATT 7980 

CGTAACAGAA AATCCATGGT TAATGGTACT AACTATATTT ATCATTAACG TTTGTTATGT 8040 

AACGTTTTTA ACGATGCGAA CAATTTTAAC GTTGAAAGGT TATCGTTATA TTGCTGCATC 8100

AGTTAGTTTT TTAGAAGTAT TAGTTTATAT CGTTGGTTTA GGTTTGGTTA TGTCTAATTT 8160 

AGACCATATT CAAAATATTA TTGCCTACGC ATTTGGTTTT TCAATAGGTA TCATTGTTGG 8220 

TATGAAAATA GAAGAAAAAC TGGCATTAGG TTATACAGTT GTAAATGTAA CTTCAGCAGA 8280 

ATATGAGTTA GATTTACCGA ATGAACTTCG AAATTTAGGA TATGGCGTTA CGCACTATGC 8340 

TGCGTTTGGT AGAGATGGTA GTCGTATGGT GATGCAAATT TTAACACCAA GAAAATATGA 8400 

ACGTAAATTG ATGGATACGA TAAAAAATTT AGATCCGAAA GCATTTATCA TTGCGTATGA 8460 

ACCTCGAAAC ATACATGGTG GATTCTGGAC TAAAGGCATT CGTCGTAGAA AGCTTAAAGA 8520 

TTATGAACCA GAAGAACTGG AAaGTGTAGT AGAaGATGAA aTTCmAAGTA AaTGAGAaTG 8580

AAmCAATtGC TGATTGTTTG TCACGAATGA AAtGCAAGGG TATATGCCGG TAAAACGTAT 8640 

TGAAAAACCC GTGTTTCAAG AGCAAAAAGA TGGCACGGTT GAAGTATCAC ATCAAGAAAT 8700 

CGTTTTTGTA GGTAAGAAAA TCCAATAACA TAATCCAATT TAAATAAAGA CTATTTGAAG 8760

AGGAAAGGCT ATTCAAAGTT TGAGTAATTT TACTTTGAAT AGCCTATTTG TTTATACATG 8820 

CAAGATGCTC GATCCATATT GTATGAGAAA CCCCCAGCAA GCTATATAAA GCATATGCTG 8880 

GGGGTTCTTA ATATTTTAAA AATTATTGTT AGATTATATA TATCGTCGCT TTTTCTAAAA 8940 

CAATCTCATC GCATGAAATT TTTTCTTCCT AGAGACCTTT AATAAGATTA ATAGTTTACT 9000 

TAATCATATC TAGATAGTCT TATGACTTAT GCTTAATGAA AGTCATTCTA GGAGAAGTTC 9060 

CCAAAGCTTC TGTGTTCATA ATTGTTAGTA GTATTTTATT ATCATTTGGT ATAAATATTT 9120 

CAATAACAAT TGAGCTATTA TTTTTATTAT ATAATGTGAG TTGTTTGTGT TCTGTATTTA 9180 
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CATTTAAATC TTGAGGATGC CATTCTCCCT CAATAATATT AAGATAATAC TTAGCCTCTG 
AATTACATTT GAATTTATCA ATACTAAATA ATTCAATTTG TTCCATAATA TTATTTACCT 
TTCTAAAATA CAAATTTTAA TAACCATAAA TAGATGAATA CCATCGATAA TGGTCGCCAT 
TGGATACTGG AATAACATTG TTTTTAGCAT CTTGAGTCAT AAAACCATTA TCCCATGGAT 
TCCATATAAT TATAACCTCT TGTCCATTAT CTAATTTAGC GTTCCCAACA ACTGCCATGG 
CATGCCCTGC GTGCATACCA TTTCTTGATT CTACTCTACT ACCTAAAACA GCAATTCCTT 
TATTATTTTT AGTAAGATTG TCAACTTCAT TATATGTAGT CATTCTATTA AGAAGTTGTG 
GACTTCTTCC CTGAGTTTGT CCAAAATAAA TCATCTCTCT TGGCGTTAAA CCAGTAAATT 
GGAATCGTTG TCCTTGTAAG TTTGGGTGTA AAAATCTCAT CACAGCTTCT GCATGATATT 
TGTTAGTATT ATAAGTCGCA TTTAGTAATT CAGACATCGT ATAGCCTGCA CACCAACCAT 
TGTTACCTTG AGTTTCTCTT ATCTTGAAAT TCTCAAGTTT ATTTATATAT TGsTCGTTGT 
AAGTATAATT ATTACTTTTA AATTGACTAG TTGGCATAGT GACAGAAGCT TTTTGCTTTA 
GTTGCGTTAC ATTATTGCCA GTAGGTATAC TCTCAGTCTT TnTnAACTnT nTATCTTCTA 
GACGTGGTGT TTTTAGTACT AGTTTAGCTT TATGATTTTG AGTACCACAT AGTAACCTTT 
TGAGTTGT 

(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7563 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

CGGAAACGnA CCCnATGCGT ATGCTTGACG TGCCAAAATT AAATACGAAG TTCATAGCTT 
TGAGGTACCA GAAGAACATT TATCTGGTCA AGAAGTCGCA GnACTCATAC AAGCAAATGT 
TAAAACAGTA TTTAAAACGC TTGTTCTAGA AAATACAAAA CATGAACATT TTGTATTTGT 
TATCCCAGTA AGTGAAACTT TAGATATGAA AAAGGCAGCT GCTTTGGTTG GAGAGAAGAA 
ATTGCAGCTT ATGCCTTTAG ATAATTTGAA AAATGTAACG GGATACATTC GTGGTGGGTG 
TTCGCCTGTT GGTATGAAAA CATTGTTTCC AACAGTCGTT GACAAATCGT GTGAAAATTA 
TAGTCATATC AGTGTGAGTG GTGGGCTTCG AACAATGCAA ATCACAATAG CTGTTGAGGA 
TTTGATTACA ATAACTAAAG GCAAAATTGG AGCAGTTATC CATGAATGAT TAATAACAAC 
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TGCCACACTC CTTTTTGATT GAATTAGCAT TTTACGATCA TAAACAGTCA TTATAATTGA 600 

GTATTTGAAC ATAAAAATGT AATTTTATCG TAACAATTTG AGTGTTTGTG ATTGTTTTTG 660 

GTAATTTATG ATTGAAAAGT GAAAGCGTAC TCATTATAAT ACAAAGTGAG ATGGGGTGAT 720 

GATGATAATT ACTGaAAAAA GACACGAGTT AATATTAGAA GAACTTTCGC ACAAAGATTT 780 

TTTGACTTTA CAAGAATTAA TAGATCGAAC TGGTTGCAGT GCTTCAACAA TACGArGAGA 840 

TTTATCTAAA CTACAACAAT TAGGGAAATT GCAACGTGTG CATGGTGGTG CAATGTTAAA 900 

AGAAAATCGT ATGGTTGAGG CGAATTTAAC TGAAAAATTA GCAACGAATC TTGATGAAAA 960 

GAAAATGATT GCTAAAATAG CAGCTAATCA AATCAACGAT AATGAATGCT TATTTATCGA 1020 

TGCTGGTTCA TCTACATTGG AGCTAATTAA ATATATTCAA GCGAAAGATA TCATTGTGGT 1080 

AACCAATGGT TTAACACATG TAGAAGCTTT ACTTAAAAAA GGTATTAAAA CAATTATGCT 1140

AGGTGGTCAA GTTAAAGAAA ATACACTTGC TACGATTGGT TCTAGTGCTA TGGAGATATT 1200

AAGACGATAT TGTTTCGATA AAGCTTTTAT CGGGATGAAT GGATTAGATA TTGAACTTGG 1260

ATTAACTACT CCCGATGAGC AAGAGGCATT AGTTAAACAA ACAGCAATGT CATTAGCCAA 1320 

TCAATCATTT GTACTTATAG ATCATTCTAA GTTTAATAAA GTATATTTTG CTCGTGTACC 1380 

TTTGCTAGAA AGTACGACAA TCATCACATC TGAAAAAGCA TTAAATCAAG AATCGTTAAA 1440

AGAATACCAA CAAAAGTATC ACTTTATAGG AGGGACTTTA TGATTTATAC AGTGACTTTC 1500 

AATCCTTCAA TTGACTATGT CATTTTTACG AATGATTTTA AAATTGATGG TTTGAACAGA 1560 

GCAACAGCAA CATATAAATT CGCTGGGGGG AAAGGTATTA ATGTCTCGCG CGTCTTAAAG 1620

ACATTGGATG TTGAGTCAAC TGCCTTGGGA TTTGCAGGTG GATTTCCTGG GAAATTCATT 1680 

ATAGATACAT TAAATAACAG TGCAATTCAA TCGAATTTTA TTGAAGTTGA TGAAGATACA 1740 

CGTATTAATG TGAAATTAAA AACAGGACAA GAAACAGAAA TCAATGCACC GGGTCCTCAT 1800 

ATAACGTCAA CACAATTTGA ACAACTGTTA CAACAAATTA AAAATACAAC AAGCGAAGAT 1860

ATAGTTATTG TTGCTGGAAG TGTACCAAGT AGTATTCCAA GCGATGCGTA TGCGCAAATT 1920 

GCACAAATTA CAGCACAGAC AGGTGCTAAA TTAGTAGTCG ACGCTGAAAA AGAATTGGCT 1980 

GAAAgCGTTT TACCATATCA TCCACTATTT ATTAAACCTA ATAAAGATGA ATTAGAAGTG 2040 

ATGTTTAATA CAACAGTGAA CTCAGACACA GATGTTATTA AATATGGTCG TTTGTTAGTT 2100 

GATAAAGGTG CGCAATCTGT TATTGTCTCG CTTGGCGGTG ATGGTGCTAT T7ATATTGAT 2160 

AAAGAAATCA GTATTAAAGC AGTTAATCCA CAAGGGAAAG TGGTTAATAC AGTTGGCTCT 2220 

GGTGATAGTA CAGTTGCAGG CATGGTGGCT GGAATTGCTT CAGGTTTAAC GATTGAAAAA 2280
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CGGGACGCTA TAGAAAAAAT AAAATCACAA GTTACGATTA GCGTACTTGA TGGGGAGTGA 2400 

AAATAATGAG AGTAACAGAG TTATTAACAA AAGATACAAT AGCAATGGAT TTAATGGCAA 2460 

ATGACAAAAA TGGTGTTATT GATGAGTTAG TAAATCAATT AGACAAAGCA GGTAAATTAA 2520 

GTGATGTCGC GTCATTTAAG GAAGCGATTC ACAATCGAGA ATCACAAAGT ACAACTGGTA 2580 

TCGGCGAAGG TATTGCCATT CCACATGCCA AAGTGGCCGC AGTTAAGTCA CCAGCTATTG 2640 

CGTTTGGTAA ATCTAAAGCA GGCGTAGATT ATCAAAGTTT GGATATGCAA CCAGCACACT 2700 

TATTCTTTAT GATTGcAGcG CCAGAAGGTG GCGCCCAAAC ACATCTAGAT GCTTTAGCTA 2760 

AGTTGTCTGG TATTTTAATG GATGAAAATG TACGTGAGAA ATTATTACAT GCTTCATCAC 2820 
CTGAAGAAGT ACTAGCGATC ATAGATGAGG CTGATGATGA AGTGACAAAA GAAGAAGAGG 2880

CAGAAGCTGA AGCACAACAA GTTGCAACTG CAGAACAATC ATCTAAACAA TCTAATGAGC 2940 

CATATGTGTT AGCAGTAACT GCTTGTCCAA CAGGTATTGC ACACACATAT ATGGCACGTG 3000

ATGCATTGAA AAAGCAAGCG GATAAAATGG GTATTAAAAT TAAAGTAGAA ACGAATGGTT 3060 

CAAGCGGCAT TAAAAACCAT TTAACTGAAC AAGATATTGA AAATGCAACA GGTATCATTG 3120 

TTGCTGCTGA TGTTCATGTT GAGACGGATC GCTTCGATGG TAAAAATGTC GTAGAAGTAC 3180

CAGTAGCAGA TGGTATTAAA CGCCCAGAAG AATTAATTAA TAAAGCATTA GATACAAGTC 3240 

GTAAACCTTT TGTTGCCCGT GATGGTCAAA GAAAAGGTAA CTCAAATGAC AGTCAAGAAA 3300

AATTAAGCCC AGGTAAAGCA TTCTATAAAC ACTTAATGAA CGGTGTTTCT AACATGTTGC 3360 

CACTTGTAAT ATCTGGTGGT ATTTTAATGG CAATTGTATT TTTATTTGGA GCAAATTCAT 3420 

TTAATCCAAA AAGCTCAGAG TACAATGCGT TTGCAGAGCA GCTTTGGAAC ATTGGTAGTA 3480

AAAGTGCATT CGCGTTAATC ATTCCAATTT TATCTGGATT CATTGCACGT AGTATTGCGG 3540 

ATAAACCTGG TTTCGCTTCA GGTCTTGTAG GTGGTATGTT AGCAATTTCA GGTGGTTCAG 3600 

GATTTATTGG TGGTATTATT GCAGGTTTCT TAGCAGGTTA CTTAACACAA GGTGTTAAAG 3660

CCATGACACG TAAGTTACCA CAAGCATTAG AGGGATTAAA GCCAACATTA ATTTATCCAC 3720

TATTAACAGT GACGGCTACA GGCTTATTGA TGATTTATGC CTTTAATCCA CCAGCATCTT 3780 

GGTTAAATCA TTTGTTATTA GATGGATTAA ACAATTTATC AGGTTCTAAT ATTGTATTAT 3840

TAGGTTTAGT TATTGGCGCT ATGATGGCGA TTGATATGGG CGGTCCATTC AACAAAGCGG 3900 

CATATGTTTT TGCAACAGGT GCGTTGATTG AAGGTAATGC AGCACCAATT ACAGCTGCAA 3960 

TGATTGGTGG TATGATTCCA CCGTTAGCAA TTGCGACAGC GATGTTAATT TTTAGACGTA 4020

AATTTACAAA AGAACAACGT GGTTCAATTA TCCCTAACTA TGTGATGGGT ATGTCATTTA 4080 
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TGATTGGTTC AGGTATAGGT GGCGCAATTG CTTTAGGCTT AGGTTCACGA ATTACTGCGC 4200

CACATGGTGG TATTATTGTA ATTGTTGGTA CTGATGGTGC ACACTTACTT CAAACTCTTA 4260

TTGCACTTCT AGTTGGCACA TTAGTTTCAG CATTAATTTA CGGTTTAATC AAACCAAAGT 4320

TAACTGAAAC AGAAATCGAA GCTTCAAAAT CAATGGACGA GTAGTTTTAA TGATGTAAAA 4380

TGATTGTTAG CAAAGAGCTT CATATTAAGT TGTATGTTCA ATGAATATAT GTTAGTTTTA 4440

TATATCGTGT TAACGGTAGC TTATACAAAG CTGTAAAAAC ACTTTCTATT AATTCAGTTT 4500

TTATGAATTG ATATGAAAGT GTTTTTATTT TTAGATAAAT GAATGAAGAA ATAGACACCA 4560

CAAATGTATA GACTTTTTTA ATATTTTGCA AAAAGTTATG CCAAACGAAG CAGATATAGT 4620

AAAATATGAG TGTCTTAAAG TGAAAATTTA TAAATAAAGA AGGGTTTATA CGTGTCAGAA 4680

TTAATTATAT ATAACGGCAA AGTTTATACT GAAGATGGCA AAATCGATAA TGGTTACATT 4740

CATGTGAAAG ATGGACAGAT TGTTGCAATT GGAGAAGTGG ATGATAAAGC AGCAATTGAT 4800

AATGATACGA CAAATAAAAT TCAAGTGATT GATGCTAAAG GTCATCATGT ATTACCAGGT 4860

TTTATTGATA TACATATTCA TGGTGGTTAT GGTCAAGATG CAATGGATGG GTCATACGAT 4920

GGCTTAAAAT ATCTATCCGA AAATTTGTTG TCTGAAGGGA CGACATCATA CTTGGCCACT 4980

ACAATGACGC AATCGACTGA TAAAATAGAT AATGCACTTA CAAATATTGC TAAATATGAA 5040

GCGGAgCAAG ATGTTCACAA TGCAGCGGAA ATTGTAGGTA TACATTTAGA AGGACCATTT 5100

ATATCTGAAA ATAAAGTTGG TGCTCAACAT CCGCAATACG TTGTACGCCC ATTTATCGAT 5160

AAAATTAAAC ATTTTCAAGA GACTGCTAAC GGATTAATAA AGATTATGAC GTTTGCACCT 5220

GAAATTGAAG GTGCAAAAGA AGCGCTTGAA ACGTATAAAG ATGACATTAT TTTTTCAATT 5280

GGTCATACAG TAGCAACATA CGAAGAAGCA GTTGAAGCTG TTGAGCGAGG AGCTAAACAT 5340

GTCACGCATT TATATAATGC AGCGACGCCA TTCCAACATA GAGAACCAGG TGTTTTTGGA 5400

GCAGCATGGT TGAATGATGC TCTACATACC GAAATGATTG TTGATGGCAC TCATTCTCAT 5460

CCGGCATCGG TTGCAATTGC TTACCGTATG AAAGGTAATG AACGTTTTTA TTTAATTACC 5520

GATGCAATGC GTGCAAAAGG TATGCCTGAA GGAGAATATG ATTTGGGTGG ACAAAAAGTA 5580

ACTGTTCAAT CGCAACAAGC ACGTCTTGCA AATGGTGCGC TTGCTGGTAG TATTTTAAAA 5640

ATGAATCATG GGTTACGTAA CTTAATATCA TTTACAGGTG ATACATTAGA TCATTTATGG 5700

CGAGTAACAA GTTTAAATCA AGCCATTGCA TTAGGTATCG ATGATAGAAA AGGTAGTATT 5760

AAAGTAAATA AGGATGCAGA TCTTGTTATT CTAGATGATG ATATGAATGT AAAATCTACA 5820

ATAAAACAAG GCAAGGTTCA CACATTTAGC TAATAAATAA TCATAATTAA ATGTATGCAA 5880



EP 0 786 519 A2




TTTTCTGGGG GTGTCTAAAT GGGAAGGCGA TAACATGTAG TTGTAATTTA AGTCATAGTG 6000
ATAAATTTGA ATGCGTGTTA CCCATGAGTG ACACATATAA CATGGAGGTG AATCCCTAGA 6060
AATAGGGAAT TAATTGGAAA CTTCGACCAT AATTAGTTTG ATTATATTTA TTCTATTAAT 6120
TGCATTAACC ACTGTATTTG TTGGTTCAGA ATTTGCATTA GTAAAAATTA GAGCAACAAG 6180
AATTGAACAG CTAGCAGATG AAGGAAATAA ACCTGCTAAA ATAGTAAAAA AGATGATTGC 6240
TAATCTAGAT TATTATCTTT CTGCTTGTCA GTTAGGTATA ACAGTAACAT CTTTAGGGTT 6300
AGGTTGGCTT GGTGAACCAA CGTTTGAAAA GCTATTACAC CCAATATTTG AAGCAATCAA 6360
TTTACCAACT GCATTAACGA CGACGATTTC GTTTGCAGTG TCATTTATAA TCGTTACGTA 6420
TTTGCATGTA GTACTTGGTG AATTAGCGCC TAAATCTATA GCTATTCAAC ATACTGAAAA 6480
GCTTGCTTTA GTATATGCAA GACCATTGTT CTATTTCGGT AACATTATGA AACCATTGAT 6540
TTGGCTGATG AATGGTTCTG CACGTGTTAT TATTAGAATG TTTGGTGTAA ATCCTGATGC 6600
CCAAACTGAT GCAATGTCAG AAGAAGAAAT CAAAATTATT ATTAACAATA GTTATAATGG 6660
TGGAGAAATC AACCAAACTG AATTGGCATA TATGCAAAAT ATCTTTTCAT TCGATGAAAG 6720
ACATGCAAAA GATATAATGG TACCTAGAAC TCAAATGATT ACACTAAATG AACCTTTTAA 6780
TGTAGACGAA TTACTAGAAA CAATAAAAGA ACATCAATTT ACGCGTTATC CAATTACTGA 6840
TGATGGTGAT AAAGACCACA TTAAAGGATT TATTAACGTC AAAGAATTTT TAACTGAATA 6900
CGCTTCTGGA AAAACGATTA AAATAGCAAA CTATATaCAT GAGTTGCCAA TGATTTCAGA 6960
GACAACACGT ATCAGTGATG CATTAATTAG AATGCAACGT GAACATGTAC ATATGAGTCT 7020
TATTATAGAT GAATATGGTG GAACGGCAGG TATTTTAACG ATGGAAGATA TTTTAGAAGA 7080
AATCGTTGGA GAAATTCGTG ATGAATTTGA TGATGATGAA GTGAATGATA TCGTTAAAAT 7140
TGATSATAAG ACATTCCAAG TAAATGGCAG AGTACTATTG GATGATTTAA CTGAAGAGTT 7200
CGGTATAGAA TTTGATGACT CTGAGGATAT TGATACGATA GGTGGATGGT TACAATCTCG 7260
TAATACCAAT TTACAAAAAG ATGATTACGT GGATACAACT TATGATCGCT GGGTTGTTTC 7320
AGAAATCGAT AACCACCAAA TTATTTGGGT GATATTAAAC TATGAATTTA ATGAAGCGAG 7380
ACCTACTATC GGACAGTCTG ATGAAGATGA AAAATCAGAA TAGATATTAA TATATAAACC 7440
AACTAAGAAT GATTTAATTC ATTTTTGGTT GGTTATTTTT TTGACTAAAA TTAAnGAAAA 7500
GTGAAAATAG TATTGGAACT CAATATCTTT AATGATTTAA TGAATAAnTT TTATTGAAAG 7560
CGA 7563

(2) INFORMATION FOR SEQ ID NO: 34:

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(A) LENGTH: 3492 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 

TTATATCAAC TTCATGGCGG AACCATTGAT GACCCATTAG ACGAAACAAT AAGCGCATTT 60

SATGAATTGA AACAAGAAGG AATTATACGT GCTTACGGTA TTTCTTCTAT TCGCCCAAAT 120 

GTAATTGATT ATTATTTAAA ACATAGTCAA ATCGAAACGA TAATGTCTCA ATTCAATTTG 180 

ATTGATAATC GTCCAGAATC ATTATTAGAT GCAATTCACA ACAATGATGT TAAAGTATTG 240

GCAAGAGGAC CTGTGTCTAA AGGATTATTA ACTTCAAACA GTGTTAATGT GCTCGACAAT 300 

AAATTTAAAG ATGGTATTTT TGATTATTCT CATGATGAAT TGGGTGAAAC AATAGCCTCT 360 

ATTAAAGAAA TTGAAAGTAA TTTATCTGCA TTGACATTTA GTTATTTAAC ATCACATGAC 420 

GTGCTTGGTT CCATCATTGT AGGTGCAAGT AGCGTCGACC AATTAAAAGA AAATATTGAA 480 

AACTATCATA CTAAAGTTAG TTTAGATCAG ATTAAAACAG CAAGAGCTCG TGTAAAGGAT 540 

TTGGAATATA CCAATCATTT AGTGTAGAAG TCATTTTCAG TAATAAAAAC AGCAGCATGA 600 

GGCGTTTCAT TATAAAAATG CCTTACTGCT GTTGTTTATG TACAATTCGC TATAATTTAT 660 

GATTATGATT ACTCACTTAT GATAGAAATT AAAGCGTTGT CCTCACGCAT CAGTATTTAG 720 

TAATTTCGCC TTGCGGCATT GCCTTAAGCA AACTTCTGCC ACTTCATCTC TTAATAATTT 780

TATTAAAACA TCTTTCTATA TTTCACTTCG CATGTTGATT CATCATTATT AGTTATTATT 840

TGTACACCCA GCACATTTCC TTGCAACACA AGTAGTTTGA ATTTTTCACA AGTATAATAT 900

AATGTACCGT CTGAAATTTG GTCTACAGAA ATATCGCCTA AAATATCCAG CACTGTAAAT 960 

TCTTCAAATA CTGATAGTTG TTCCGCATAT CGTACACAAA GTCTTACCAC ACTCTCCGAT 1020 

TGACAGTTCA TTGCCATCCC ACCTATTTAT GCTTTATTTT TAAATAATTT AGGGAAACAT 1080 

CGTTCAAAAA ATCTAGGCGC AATTTGATAC ATTTTCAACG CATGaTGCAT CCATTTAGGC 1140 

CGAT7AATTT CCAATTGTTT TGTTTTAATG CCATAAATGA TATCTTCTGC AAGCTGATTA 1200 

GCATCAAGCA TAATTTCCCC CATCTTTTTA gCATACTTCA TTGATGGGTC GGCTTTTTGA 1260 

TGAAAAGGTG TATCAATCGG GCCAACATTA ACTGTCATGA TATGTAAGTT TGGTGACTCT 1320 

AGTCTTAAAG CATTCATTAA TGCATAAAAC CCTGCTTTCG ATGCCCCATA ATGTGCAGCA 1380

TTTGCTTGTG TGGAAAATGC AGCTTGACTT GAAATACCTA CAATATGTGC GTTAGATGTT 1440

AAATATGGTC TCAACACAGT ATATAAAACA TTAAAACTAA TTAAATTAAG CTGATACGTT 1500

TAAATGAATC CATCGAATGA TGTATTGTCT TCAAATTGCA GTGCCTGTAT CGACTTCAAA 1620

TCATTTAAGT CACAAGGAAT AACATTTATA GTTTTCCCCA ATTCCTGTTC AAAGATTCTA 1680

GTTGCTTTAT CAACATCACG CACCAACAAC GTTACATGCA CTTTATTTTC TAGTAACTTT 1740 

CGGACAATCG ATAAACCTAA ACCACTCGTA CCACCAGTCA CTATAAAATG TTGTCCTTTC 1800 

ATCAATTAAC CTTCCTTTTC AATTATATAG AATGCAATTT ATCAACTTTA CATAATTGAG 1860 

ACAAGTTGAT TATCTTTCCT AATATATATA CAATAATAAG AAAATATAAC ATACAAATCA 1920 

AAAACTAAAG GGATGTGaCG TTAATGrAAC TCGTATTTTA TGGAGCTGGT AATATGGCAC 1980 

AAGCTATATT TACAGGrATT ATTAACTCmA GCAACTTAGA TGCCAATGAT ATATATTTAA 2040

CAAATAAATC TAATGAACAA GCTTTAAAAG CATTCGCTGA AAAACTAGGT GTTAACTATA 2100 

GTTATGAtGA TGCGACATTA TTAAAAGATG CAGAyTATGT ATTTTTAGGT ACCAAACCAC 2160 

ATGACTTTGA TGCTCTAGCA ACACGCATCA AACCACATAT TACAAAAGwC AATTGCTTCA 2220

TTTCAATTAT GGCAGGTATT CCGATTGATT ATATTAAACA ACAATTAGAA TGCCAAAATC 2280

CaGTTGCTAG AATTATGCCA AACACAAATG CGCAAGTTGG ACACTCTGTT ACTGGCATTA 2340 

GTTTTTCAAA CAACTTTGAC CCTAAATCTA AAGATGAAAT TAACGATTTA GTTAAAGCAT 2400 

TTGGTTCTGT AATTGAAGTA TCAGAAGATC ATTTACATCA AGTAACAGCT ATCACCGGAA 2460 

GCGGCCCAGC ATTTTTATAT CATGTATTCG AGCAATATGT TAAAGCTGGT aCsAAACTTG 2520 

GTCTAGAAAA AGAACAAGTT GAAGAATCTA TACGCAACCT TATTATAGGT ACAAGTAAGA 2580 

TGATTGAACG TTCAGAtTTG AGCATGGCTC AATTAAGAAA AAATATTACC TCTAAAGGTG 2640

GTACGACACA AGCTGGCCTT GATACATTGT CACAATATGA TTTAGTATCT ATTTTCGAAG 2700 

ATTGTCTAAA CGCTGCCGTC GACCGTAGTA TTGAACTTTC TAATATAGAA GACCAATAAA 2760 

AACA5ACCCG CCAACACATG TATGCATCAT CGCAAGCACT GTGTTTGACG GGTTATTTTT 2820

ATAATTTATT GTTATTTGGC AAGCATTGTT TATTACTTTG TCATTAGATT TTAAAACTAT 2880

CAAAATCTTT TACAAAATTA AAATTAGGTG TATCTTCATT TTGTATCAAT GTTTGATAAA 2940

TTTCATTTAT ATCTTCTGTA TTATAGCGAT TGCTCAAATG TGTAATCAAC GTACGTTTAA 3000 

CATTGGCTTC TTTTATCAAT GCAAATACGT CTTCAATATG GCTATGATGA TAATTGTTGG 3060

CTAAATGCTT TTCACCATCT ATATAGGTCG CTTCATGTAC CATCACATCA GCATCTCTAG 3120

AAATCACACG TTCATTAGAA CATGGTTTTG TATCACCAAA AATTGCTACA ACTGGACCCT 3180 

GTTTGGACTC ACCTCTAAAA TCTTTTGATT GATAAACTTG ACCATTATGT TCAAATGTAT 3240 

CATGAGATTT TACTTCTTGA TATTTAGGAC CTGGTTCAAG ACCAATGTTT T7TAACGCTT 3300 

CATGATTAAG TAAATGCGCC TCTACAGTAA AACCATCCAT GATGATATGT CAGATGATCA 3420

TCGATTTCAA TATATGtAAT TGGATAGTTT AAATGTGACT CTGATAAATT CATAGACATT 3480

TCCACATATG CT 3492



(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 1973 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:

ATCTAGCGGT ACAAGCGTCT TGGAGGCTAG TATGTTGAAC ATTGTAAACC CTGAAGATCA 60

CTTCGTTGTC ATTGTTTCAG GTGCCTTTGG TAACCGATTT AAACAAATTG CACAAACTTA 120

TTACAAAAAT GTGCATATTT ATGACGTAAC ATGGGGAGAA GCTGTAGATG TCAAAGATTT 180

CATCAATTTC CTTTCAACTT TAAATGTTGA AGTTAAAGCA GTATTTAGTC AATATTGCGA 240

AACATCTACG ACAGTGCTAC ACCCTATTCA CGAGTTAGGA AATGCCATTA ATCAATTTAA 300

TAGTAATATT TATTTTGTAG TTGACGGCGT AAGTtGCATT GGTGCTGTTG ATGTTGACAT 360

TAACAAAGAT AAAATTGATG TACTTGTTTC TGGTAGTCAA AAAGCAATTA TGTTACCTCC 420

AGGATTAGCT TTTGTAGCTT ATAGCCACCG TGCAAAAGAA CATTTCAAAG AAGTAACTAC 480

GCCAAAATTT TATCTAGACT TAAATAAATA CATTTCGTCA CAAGCTGACA ATTCTACACC 540

GTTCACACCA AATGTGTCTT TATTTAGAGG TGTAAATGCA TACGTTGAAA CCGTAAAAGC 600

AGAAGGTTTC AATCACGTAA TAGCACGACA CTATGCAATT AGAAATGCAT TAAGAAGCGC 660

CTTAAAAGCA TTAGATTTAA CTTTATTAGT CAATGATAAA GATGCATCTC CAACGGTTAC 720

AGCATTCAAA CCTAATACAA ATGATGAAGT GAAAATAATC TtiAAGATGAAC TTAAAAATnG 780

CTTTAAAATA ACAATTGCnG GTGGTCAAGG CCATCTTAAA GGTCAAATTT TnAGAATTGG 840

TCATATGGGG AAAATTAGTC CTTTCGATAT TTTATCGGTA GTATCTGCTT TAGAAATTAT 900

TTTAACTGAA CACCGTAAAG TTAACTATAT CGGTAAAGGT ATATCAAAAT ATATGGAGGT 960

TATTCATGAA GCAATTTAAT GTACTCGTTG CAGATCCCAT ATCAAAAGAT GGTATCAAAG 1020

CATTATTAGA TCACGAACAA TTCAATGTAG ATATTCAAAC TGGCTTGTCC GAAGAAGCAT 1080

TAATCAAAAT TATACCTTCA TACCATGCTT TAATCGTTCG TAGTCAAACT ACGGTTACTG 1140

AAAATATCAT AAATGCTGCT GATTCTTTAA AAGTAATCGC ACGCGCCGGT GTTGGTGTAG 1200

GTAATACGAT TTCAGCTACT GAACATACAC TGGCAATGTT ATTATCAATG GCACGAAATA 1320

TTCCGCAAGC ACACCAATCA CTTACAAATA AAGAATGGAA TCGAAATGCA TTTAAAGGTA 1380 
CTGAGCTTTA TCATAAAACA TTAGGTGTCA TTGGTGCTGG TAGAATTGGT TTAGGTGTTG 1440

CTAAACGTGC GCAAAGTTTC GGAATGAAAA TACTAGCTTT TGACCCTTAC TTAACGGATG 1500 

AAAAAGCAAA ATCTTTAAGC ATTACGAAGG CAACAGTTGA TGAGATTGCC CAACATTCTG 1560

ATTTCGTTAC ATTACATACA CCACTAACAC CTAAAACAAA AGGCTTAATT AATGCTGTCT 1620 

TTTTTGCCAA AGCAAAACCT AGTTTGCAAA TAATCAATGT GGCACGTGGT GGTATTATTG 1680 

ATGAAAAGGC GCTAATAAAA GCATTAGACG AAGGACAAAT TAGTCGGGCA GCTATCGATG 1740 

TGTTTGAACA TGAACCTGCA ACTGACTCGC CTCTTGTTGC ACATGATAAA ATTATTGTTA 1800 

CACCTCATTT GGGTGCTTCA ACAGTCGAAG CTCAAGAAAA AGTGGCAATT TCTGTTTCAA 1860 

ATGAAATCAT CGAAATTTTA ATTGATGGTA CTGTAACGCA TGCAgTGAAT GCACCTAAAA 1920

TGGACTTAAG CAATATAGAT GATACTGTAA AATCATTCAT CAATTTAAGC CAA 1973 
(2) INFORMATION FOR SEQ ID NO: 36: 




(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7620 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:

GGTGTTTCAG ATGTCACTGG TTGATTTTTA ATTGTAGACG GGTATTTTGG GCTTTCGCCA 60 

TATTTATTTG CCGGCTTACT GTCAAAGCAT AGGAATACTA TCATAACAAT TGTTAGGCCT 120 

AAAT^AACAA AATAAAGAAG TACTAACAAA ATATTAAGAC CCATCGGCAT TAATGTAAAA 180

TCACTGTCAT AATAACTATC GATAATCTGT AATACTATAT AAAATATAAT ACTGAATACT 240

GTCATAATCA TTGGAAATAA CATTGTTCTT GATATATCGT GAAATCTTCG AACGCACAAC 300 

GCTAAATTTG GAATAAACGT TGCCAAACTA TAGACAAAAG TATACACAGA TGTAAGGATA 360 

ATCATCAATA TACTCATAAC TATTAATGTT TCGTTATCCG CCGCTATAGA AATAAAGAAT 420 

AGAAATAGGT TTATTATTAG CACACACACA GCTGGAACCA TAAGTATCAA ATGCCATAGT 480 

GCCATATACC AATATTCACT ACGTCTTGAT CTCCCCTTAA AATTTACATA ATTTTTCCAA 540

AATAAAACGA ATGATTTCAT AAAACCTACT TGAGGTAATT GTTCCATTGT AATCTCCCTT 600 

TCGTTAATCA TATTTATATT TTTAATTATT GTTACCGTTA TAATTTACAA GATTCATTAT 660

GTAAAATGAA AACCCGCTAC AAGTACACAT CTATATGGAG ACTCATTTGA AAGTCAACGC 780

TTCGTTAACT ATACTAAAAA TATGTCATAC TGCAATGTTC ACGTTTAAAA GAGTCTCAAT 840 
CTATGCAAAT AAAATATTCC ATAACAAAGT ATATACTTTA CATTTTTATA ATTCTTAACA 900 
ATACTATTTT ATCAAACATT TACCACAATA AAAATATCTT TTTCATTTTT ATTTAAATTA 960 

ATCATATAAT TGCGAGGAGA ATATTATGGA TTTCGTTAAT AATGATACAA GACAAATTGC 1020 

TAAAAACTTA TTAGGTGTCA AAGTGATTTA TCAGGATACC ACTCAAACGT ATACAGGCTA 1080 

CATCGTGGAA ACGGAAGCTT ACTTAGGTTT GAATGATCGT GCGGCTCATG GCTATGGCGG 1140 

TAAAATAACA CCTAAAGTCA CGTCATTATA TAAACGTGGT GGTACAATTT ATGCACATGT 1200 

CATGCATACG CATTTACTCA TTAATTTTGT AACAAAATCT GAAGGTATAC CTGAAGGCGT 1260

ACTTATCCGC GCAATTGAAC CAGAAGAAGG TTTATCCGCT ATGTTCCGTA ACAGAGGTAA 1320 

GAAAGGCTAC GAGGTAACGA ATGGCCCAGG AAAATGGACT AAGGCATTTA ACATTCCACG 1380

GGCTATCGAT GGCGCTACGT TAAATGACTG TAGATTGTCT ATTGATACTA AGAATCGTAA 1440 

ATATCCTAAA GATATTATTG CTAGTCCACG AATCGGTATT CCAAATAAAG GTGATTGGAC 1500 

ACATAAATCT TTACGTTACA CAGTGAAAGG TAATCCATTT GTGTCTCGCA TGCGTAAATC 1560 

AGATTGTATG TTTCCCGAAG ATACTTGGAA ATAAATGCCA TCTTTCATTG ATTACTATCA 1620

TGAAAATGAA ATCTATCTCC TTATAAGTCA ATCAATCGTG CCGTCAACAT GCGGATGGGT 1680 

TGATTGTTTT TCTTTGTATC CATCATATTT TTTGATTCAT CTCCTCTTAT TGAACTTGTT 1740 

CTTAATTATA AAATATAACA ATAGAATTAT TTATAATTAT TAAATTTAGA TGCATTAATA 1800 

TTATTGATAT TATTTTCAAA AACTAGAAAT ATTGATTTGT TGCATGTATA ATGTTAAAAG 1860

CGCCCTTTTA TAACGCTTAC ATATAAAAGC TTATTTAGGG AGAGGGATAT TCAACAAGGG 1920 

GGATTTGAAA ATGATAGAAC TTAATGCAAT TACAACATTA TGTTTAGCTT GTATCCTTTA 1980

TTTACTTGGT AAGGCTATCG TTAATCACGT TAATTTTTTA AAACGTATTT GTATACCAGC 2040

ACCAGTGATT GGCGGCTTAA TCTTTGCTAT TTTAGTTGCG GCTTTGGATT CATTTGGCAT 2100

GGTTAAGATT AAATTAGATG CTTCATTCAT TCAAGATTTC TTCATGTTAG CATTCTTTAC 2160 

GACAATCGGT CTTGGTGCAT CATTGAAATT ATTTAAATTA GGTGGCAAAG TCTTGCTATT 2220

ATACTTTATG TTTTGTGCTA TCATTTCAGT CATTCAAAAC ATAGTTGGTG TATCACTAGC 2280 

AAAAGTATTA AATATTAAAC CTTTGTTAGG ATTAACAGCA GGTTCCATGT CTATGGAAGG 2340

CGGTCATGGT AATGCTGCTG CTTATGGTAA GACAATTCAA GATTTAGGTA TTGATTCGGC 2400

ACTGACAGCG GCTCTTGCAG CTGCAACTTT AGGTCTTGTA TTTGGAGGGC TTATCGGTGG 2460

ATOTAAAGAT TATAGCCAAG TAGCATATAA CGAACATTTA CATAGTAAAT TTAATGCCAC 2580

TGAAGTATTC TTCATTCAAT TTACAATCGT TGTATTCTGT ATGGCAGTTG GAAGTTATTT 2640 

CAGTCATTTG TTTACAGCTC AAACAGGGAT TAATGTTCCA ATTTACGTTG GCTCATTATT 2700 

TGTAGCTGTT ATTGTCCGAA ATATCTCTGA AAGTTTTAAT TTTAATATTG TAGATTTAAA 2760 

AATTACTAAT CAAATTGGCG ATGTCGCATT AGGTATTTTC TTATCTCTTG CGCTAATGAG 2820 

CATTCAATTA ATCGAAATTT ATAAACTTGC TATACCTCTT ATTATTATCG TTTTAGTTCA 2880 

AGTTGTCGTT ATGATTTTAT TTGCTGTTTT AATTTTATTT AGAGGTTTAG GAAAAGATTA 2940

TGATGCTGCA GTAATGGTAG GTGGTTTTAT CGGTCATGGG CTTGGTGCAc GCCAAATGCC 3000

ATGGCAAATT TAGATGTTAT TACTAAAAAA TATGGAAACT CACCTAAAGC ATATTTAGTT 3060 

GTACCTATTG TTGGTGCATT CTTAATCGAT TTAATTGGTG TTATAGTCAT TATGGGATTC 3120 

ATACAATGGT TTAGTTAAAC ACCAAACTCA TAAATAAAAG AGGAGGCCTT CGCCTCcTcT 3180

TTTATTTATC CTCGATGTAT ATTCAAGTTA CGTTGTTCTA TCCATGACAA TATTTCCGGA 3240 

CTAAATACGA TTTGTTTTTG TGTTAAGTCG TCAATATTTT TAGCATCTAA CATCGTCATT 3300 

ATTGATTTCA TGTGTTCAAT AAATGATTCT ACATAAGCTA CTGTATGTGC AATGCCATTA 3360 

TTTTCAACTT GATTTAAAAA CGGACGTGAC ATACCAGTTG CCTTTGCACC AAGTGCTAAA 3420 

CTTTTAATTG CATCGAGTGG TGTACGTAAA CCACCACTCG CGAAAACTGA AATTTCGCTT 3480

TGATAAGCCG TTGTTTCAAG TAATGACTCA ACTGTAGACT GTCCCCATGA TGATAAGTAA 3540 

TCCATATCTT TATTTGCACG ACGTTCATTT TCAATATCTA CAAAGTTAGT ACCACCTTTG 3600 

CCACTAACAT CGACATACTT GACGCCTATT TGTTGTAAGT CATGCATTAA TTCTTTGCTC 3660

ATACCAAATC CAACTTCTTT TATAATGACT GGAACAGACA CTCGTGATAC AATCGACGCT 3720 

ATATlATCTA ACCAAGTCAC AAATTCACGA TTCCCTTCAG GCATAACTAA TTCTTGAGGA 3780

GAATTAACAT GGATTTGTAA CGCTTGTGCC TCAAGTAATT CAACTGCTTC CAAAGCCTTT 3840

TCTACTGGTA CGTCCGCACC AACATTGCTA AAAATCATGC CTTCAGGATT CATTTTTCGC 3900 

GCAATCGTAA ACGTCTCAGC CATGCGTGGA TTTCTCAATG CCGCATGTGT TGATCCAACT 3960 

GCCATCGCTA AGCCAGTTTC TCTTGCAACT ACAGCTAGCT TTTCATTGAT GTTTTTCGTC 4020 

CACTCGCTAC CACCCGTCAT TGCATTAATA TAAACCGGAT ATGCCATCGT TAAGTCAGGC 4080 

GTCTGTGATG TCAAATCGAT ATCATTTACA TTAATTGATG GGATAGAATG ATGCACAAAA 4140

CGCATCTTAT CAAAATCTGA ATGCATTGCG TCAGATTGGG CCATTGCTAT TTCAACATGT 4200 

TCATTTTTTC TCTGTTCTCT TTGAAAATCA CTCATGATTA AACCTACCTT TTCGTCATTT 4260

ATTACAGCTA AGCAAATATA ATATCCATAA TGTAAATGTA ATGCCGGCAT ATTTACAAAG 4380

TTCATACCAT AAATCCCAGC TATGAATGTT AACGGTGAAA ATATAACTGA TACTAATGTC 4440

AGTACTTGCA TAATACTATT CATTCTAAAT GACGTGTATG ACTCAAAATT TTCTCGTATT 4500 

TCGTTTGTCA TTTCTTGAGC AGTACGAATG ATATTACGTT GCTTAATCAA GTGGTCATCG 4560

ATATGTTGAA TGTATAGCGA ATGTTTATTA TCTATAATCA AATCACCATT TTGTTTCATT 4620 

GTATCAATTA GCTCTTGCAT AGGAAACAGT ACACGTTTTA CTTTAATCAA ATCCGAACGT 4680 

AACTTAAAGA CACTATCCAT GACCATTTTA TTAAAGCGAT CATCTACATG GCGGTCTTCA 4740 

AAATGATAAA CACTATCTTC AAGTGCATAT ACAAAGTTGA AATATTTATC AACCATCATA 4800

TCTAAAATTA ATATGACGAC ATCTGCACAA TCTAATTCTG CATCTAATGT ATTCATATAC 4860

TTATAGACTA CTTTATTTAA TGATTCCAAC GTTTGATGAT GATATGTTAC TAATACATTG 4920

TCTTGTATAA AAATATTTAG TGCTATTGGT GAATAGTTTG ACCCCATAAT ACTATGGAAT 4980

ACTAAGTATT GATAATCTTT ATAAGATTTA TATTTAGCTC GTGGCATACC GTTAATTGCA 5040 

TCATCCACTT CTAAATCATT AAAATTAAAA TGTGCTTTAA ACCATTCATT TTCTTGTTCA 5100 

TTCGGTTCAT CAAAATCATA CCAAACAATA GTCGCATCTT TTGGTATCTC TTTGATATCA 5160

TCAACTACTT TAAACGGTTC ATATGTAGTT TGATACCGTA TCTTTAAAGC CATCGATACT 5220 

CCCCCTAAAT AACGAATTCT CTATTATTTT ATCATGAATT AAATAACGTG TATGTCTTAA 5280 

TTTATTTTAG TATGATAGTC ACTAAGGAGA TGGTTATTAT CAAACAACTT TTTACACATA 5340

CTCAAACCGT AACATCTGAA TTCATTGACC ATAACAATCA TATGCATGAT GCAAATTATA 5400 

ATATCATTTT TAGTGACGTC GTGAATCGTT TTAATTACAG CCACGGTCTT TCTTTAAAAG 5460

AACGCGAAAA TTTAGCATAT ACGCTATTTA CACTAGAAGA ACATACGACA TACCTCTCAG 5520 

AATTGTCTCT TGGCGATGTA TTTACTGTTA CTTTATATAT TTATGATTAC GATTATAAGC 5580 

GGTTGCATTT ATTTTTAACA TTAACTAAAG AAGATGGTAC ACTAGCATCA ACAAATGAAG 5640

TAATGATGAT GGGAATTAAT CAGCACACAC GTCGTTCTGA TGCTTTTCCT GAATCATTTT 5700 

CAACACAAAT AGCACACTAT TATAAAAATC AATCAACTAT CACTTGGCCT GAACAATTAG 5760 

GACATAAAAT AGCAATTCCA CACAAAGGAG CATTAAAATG ACAGATGCAT TACAACAAAA 5820 

GATTCATATC GAATTAC7AG ATTTATTAGA TGATGTTAAG TTTGAATTAA CAGAATTAAA 5880

TGCACAAAAA GGGTTATACA TTAACGGACC AGCAAATCAG CTACTTAAGC GTGGCGTGCA 5940 

TATGGCTTAT GTTCAAGGAC AAAAGCAAGC CATCGATAAT ATTATGACTA TTGTGGAACA 6000 

ACAGCTTGAA AGATCAACAT TTCCTAGAAC ATTATGATAA ATTTCAAAAT GAGGTTGCTC 6060

ATAATTTTTT AGATCAATTT TATCAAATTA AAGGGCAATA CTTTATCATC ACACATATCA 6180

ATACACTTAT TGGTGATTTT CACTCAGAAG CTCATTAACA ATTAGTCTAT ATAACCCTTG 6240 

CTATATTTTC AAAAACAAAA CCCAATTACG TTTTCATGTC AAATATCATC TTGCATGAAA 6300 

TCGTAACTGG GTCATTTATA TGTTATTAGT TATTTTGTGT TACATCCTCA TCTATCGATT 6360 

TGGCAATTTG TTTAATAGCT TTATGTGATT GTCTAATTGG ATAAATTGGA AAATCATGTA 6420 

CCATCTTAGG ATAATCATAA AACTCAATGT ATTGATGATG TTGCAACATC ATTTGTTCAA 6480

ATAGCTTCAT ATCAGGATGT GTCATTTCAC GTCCACCACC AAACATATAA ACTGGTGGCA 6540 

ATCCTTCTAT TGTGCCATTA ATTGGCGATA TGCGCTTATC TGTTAATGGT AGGCCATTCG 6600

CCCATTTTTT CATAATCTCA TTGACACCAA ACTGACTTAG aACCGCATCT TGTTCGATTA 6660 

AGGCGTCCGA AATATCTTTA TTAGATAGTG TTGCATCTAA AATTGGTGAG ATTAAATACA 6720 

ATTTATTCGG TAATGGCTGT TGATTAkCTA AAAGAGATTG TACAAAGGAT AATGCCAGTG 6780

CACCACCTGA ACCATCACCC ATGACTACGA CATTTTGATG TCCTACTTCA GATACTAATT 6840 

GaTCATAAAC ACGTTGTATC GCTTGGnAAA GTATCGTCaA TATGnAAACT CTGGTGTCTT 6900 

TGGATAGATA GGCAGTACAA CCTCATATAA TGtACTTAAA GTGATTTTAT CCCAACAATC 6960 

TCCAATGGAA CGGTGATGGT TGTAGTGCAT TGAATCCACC GTGAATATAT AAAATTTTCT 7020 

TATCAATTTG ATGTCTGAAA TTAAAGCGAA AGACTTGCAT ATCATCTAAT GACAATTTTT 7080 

CTAAATTTGC TTTAACATTT AATGTTGAAG GCTGCTTATG TTTTTTTCTA TTTTCAATTT 7140 

CTCTTTTATA AAAAAATCTT TCAACATCTT GATCATTTTT AAACATAATC GAGCGATTGT 7200 

GAAGCAAATA TTTATTGACA ACGCTATTCA TAACACGGTT TCTAATCAAT GTCTTAACCT 7260 

ACCTTTATAT ATTTTATGTA TCCAATGATk GTCTATCCCC TACATTCTTT GCCAAAAAAA 7320 

GTAXATAATG TAGAAGATAT TTTCTTTTTC ACTTTCAAAT TTAAGACTAC AATTGAACAG 7380 

TGATTTTTCA TCATTATAAC AGACAACTAG ACATATTGAT AAGTAAAGAA AAGAACTTTA 7440

TACGGAGGTA CCTTGCATGA CAAATCCAAA TCAACGATTA GAACCATTTG ATGAGACATT 7500 

TCAACAACCG AATATTCATC GTGGTAAGCG ATATGGTAAG AAAAAACGTT CATTGGTAAG 7560

CATGATTATT CAAATCATTG TTGTwATATT AACCACCATC GCTGGAATAC AGCATGGTGG 7620 

(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS : 
(A) LENGTH: 9834 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:

GTCATtACCG amTTTCtTAG AaTCATTTAA AGATGATAAA TATACAAACG TTGGTAATTT 60 

AAAAGAAGTG AATTTTGATA AAATTGCTGC GACGAAACCC GAAGTAATCT TTATCTCTGG 120

ACGTACAGCT AATCAAAAGA ATTTAGATGA ATTCAAAAAA GCTGCACCTA AAGCGAAAAT 180 

TGTTTATGTT GGTGCAGATG AAAAGAACTT AATTGGTTCA ATGAAACAAA ACACTGAAAA 240

TATCGGAAAA ATTTACGATA AAGAAGATAA AGCTAAAGAA TTAAATAAAG ATTTAGATAA 300

CAAAATTGCT TCAATGAAAG ATAAAACGAA AAACTTCAAT AAAACTGTTA TGTATTTACT 360 

AGTTAACGAA GGTGAATTAT CAACATTTGG ACCTAAAGGT CGTTTTGGTG GATTAGTTTA 420 

CGATACATTA GGATTCAATG CAGTTGATAA AAAAGTAAGT AATAGCAATC ATGGACAAAA 480 

TGTTTCTAAC GAATATGTTA ATAAAGAAAA TCCAGATGTT ATTTTAGCGA TGGATAGAGG 540

TCAAGCGATA AGTGGTAAAT CAACTGCGAA ACAAGCATTA AATAATCCTG TATTAAAAAA 600

TGTTAAAGCA ATTAAAGAAG ACAAAGTATA TAATTTAGAT CCTAAATTAT GGTACTTTGC 660 

AGCTGGATCA ACTACAACTA CAATTAAACA AATTGAGGAA CTTGATAAAG TTGTAAAATA 720 

ATTTTAAAAG AGGGGAACAA TGGTTAAAGG TCTTAATCAT TGCTCCCCTC TTTTCTTTAA 780

AAAAGGAAAT CTGGGACGTC AATCAATGTC CTAGACTCTA AAATGTTCTG TTGTCAGTCG 840 

TTGGTTGAAT GAACATGTAC TTGTAACAAG TTCATTTCAA TACTAGTGGG CTCCAAACAT 900 

AGAGAAATTT GATTTTCAAT TTCTACTGAC AATGCAAGTT GGCGGGGCCC AAACATAGAG 960 

AATTTCAAAA AGGAATTCTA CAGAAGTGGT GCTTTATCAT GTCTGACCCA CTCCCTATAA 1020 

TGTTTTGACT ATGTTGTTTA AATTTCAAAA TAAATATGAT AGTGATATTT ACAGCGATTG 1080

TTAAACCGAG ATTGGCAATT TGGACAACGC TCTACCATCA TATATTCATT GATTGTTAAT 1140

TCGTSTTTGC ATACACCGCA TAAGATTGCT TTTTCGTTAA ATGAAGGCTC AGACCAACGC 1200 

TTAATGGCGT GCTTTTCAAA CTCATTATGG CACTTATAGC ATGGATAGTA TTTATTACAA 1260 

CATTTAAATT TAATAGCAAT AATATCTTCT TCGGTAAAAT AATGGCGACA scgTGTTTCA 1320 

GTATCGATTA ATGAACCATA AACTTTAGGC ATAGACAAAG CTCCTTAACT TACGATTCCT 1380 

TTGGATGTTC ACCAATAATG CGAACTTCAC GATTTAATTC AATGCCAAAT TTTTCTTTGA 1440

CGGTCTTTTG TACATAATGA ATAAGGTTTT CATAATCTGT AGCAGTTCCA TTGTCTACAT 1500 

TTACCATAAA ACCAGCGTGT TTGGTTGAAA CTTCAACGCC GCCAATACGG TGACCTTGCA 1560

AATTAGAATC TTGTATCAAT TTACCTGCAA AATGACCAGG CGGTCTTTGG AATACACTAC 1620

CACATGAAGG ATACTCTAAA GGTTGTTTAG ATTCTCTACG TTCTGTTAAA TCATCCATTT 1680

AGTGTTCTTT TTGAATAATG CTATTACGAT AATCTAACTC TAATTCTTTT GTTGTAAGTT 1800

TAATTAACGA GCCTTGTTCG TTTACGCAAA GCGCATAGTC TATACAATCT TTAACTTCGC 1860

CACCATAAGC GCCAGCATTC ATATACACTG CACCACCAAT TGAACCTGGA ATACCACATG 1920 

CAAATTCAAG GCCAGTAAGT GCGTAATCAC GAGCAACACG TGAGACATCA ATAATTGCAG 1980 

CGCCGCTACC GGCTATTATC GCATCATCAG ATACTTCGAT ATGATCTAGT GATAATAAAC 2040 

TAATTACAAT ACCGCGAATA CCACCTTCAC GGATAATAAT ATTTGAGCCA TTTCCTAAAT 2100 

ATGTAACAGG AATCTCATTT TGaTAGGCAT ATTTAACAAC TGCTTGTACT TCTTCATTTT 2160 

TAGTAGGGGT AATGTAAAAG TCGGCATTAC CACCTGTTTT AGTATAAGTG TATCGTTTTA 2220 

AAGGTTCATC AACTTTAATT TTTTCATTTG GGATAAGTTG TTGTAAAGCT TGATAGATGT 2280 

CTTTATTTAT CACTTCTCAG TAGATCCTTT CTCATGTCTT TAATATCATA TAGTATTATA 2340 

CCAATTTTAA AATTCATTTG CGAAAATTGA AAAGAAAGTA TTAGAATTAG TATAATTATA 2400

AAATACGGCA TTATTGTCGT TATAAGTATT TTTTACATAG TTTTTCAAAG TATTGTTGCT 2460 

TTTGCATCTC ATATTGTCTA ATTGTTAAGC TATGTTGCAA TATTTGGTGT TTTTTTGTAT 2520 

TGAATTGCAA AGCAATATCA TCATTAGTTG ATAAGAGGTA ATCAAGTGCA AGATAAGATT 2580

CAAATGTTTG GGTATTCATT TGAATGATAT GTAGACGCAC CTGTTGTTTT AGTTCATGAA 2640 

AATTGTTAAA CTTCGCCATC ATAACTTTCT TAGTATATTT ATGATGCAAA CGATAAAACC 2700 

CTACATAATT TAAGCGTTTT TCATCTAAGG ATGTAATATC ATGCAAATTT TCTACACCTA 2760 

CTAAAATATC TAAAATTGGC TCTGTTGAAT ATTTAAAATG aTGctACCGC CAATATGTTT 2820 

TGTATATTTT ACTGGGCTGT CTAAGAGGTT GAATAATAAT GATTCAATTT CAGTGTATTG 2880 

TGATTGAAAA CAATTAGTTA AATCACTATT AATGAATGGT TGAACATTTG AATACATGAT 2940 

AAAGTcCTTT GATATTGAAA ATTAATTTAA TCACGATAAA GTCTGGAATA CTATAACATA 3000 

ATTCATTTTC ATAATAAACA TGTTTTTGTA TAATGAATCT GTTAAGGAGT GCAATCATGA 3060 

AAAAAATTGT TATTATCGCT GTTTTAGCGA TTTTATTTGT AGTAATAAGT GCTTGTGGTA 3120 

ATAAAGAAAA AGAGGCACAA CATCAATTTA CTAAGCAA7T TAAAGATGTT GAGCAAAAAC 3180 

AAAAAGAATT ACAACATGTC ATGGATAATA TACATTTGAA AGAAATTGAT CATCTAAGTA 3240

AAACTGATAC AACTGATAAA AATAGTAAAG AATTTAAGGC ACTACAAGAA GATGTTAAAA 3300

ACCATCTCAT ACCTAAATTT GAAGCATATT ATAAGTCAGC AAAAAATTTG CCTGATGATA 3360 

CAATGAAAGT TAAGAAATTA AAAAAAGAAT ATATGACGCT TGCAAATGAG AAGAAGGATG 3420 

CGATATATCA ATTAAAAAAA TTCA7AGGTT TATGTAATCA ATCTATCAAG TATAACGAAG 3480

AATTAGCTGA TAATAAAAGT GAAGCAACTA ATCTTACGAC AAAATTAGAA CATAATAATA 3600

AAGCGTTAAG AGATACTGCG AAGAAGAACC TAGATGATAG TAAAGAAAAT GAAGTAAAAG 3660 

GCGCGATTAA AAATCACATT ATGCCAATGA TTGAAAAGCA AATTACCGAT ATTAACCAAA 3720 

CTAATATTAG TGATAAGCAT GTTAATAATG CAAGGAAAAA CGCAATAGAA ATGTATTACA 3780 

GTCTGCAGAA CTATTATAAT ACACGTATTG AAACAATAAA GGTTAGTGAG AAGTTATCAm 3840 

AAGTCGATGT AGATAAGTTG CCGAAAAAGG GTATAGATAT AACTCACGGC GATAAAGCCT 3900

TTGAAAAAAA GCTTGAAAAA TTAGAAGAAA AATAACTATA ATCATTTTTC AAAGTTAAAA 3960 

ATTTTGAATT TATGGTTAAC ATGTCAACTT ACTATGTGTA TAATGGTAAA CATTGATATT 4020 

AACTATATGT ATAAAAATGT CACGCAGATG CTATTTAAAT GTGATAAATA TTTTTAGAGG 4080 

TGAATAGAGT GGCTATAAAG CTAAGTTCAA TTGACCAATT TGAACAGGTT ATTGAGGAAA 4140 

ATAAATATGT TTTTGTATTA AAACATAGTG AAACTTGTCC AATATCGGCA AATGCGTACG 4200

ATCAATTTAA TAAATTTTTA TATGAACGCG ATATGGACGG TTATTATTTG ATTGTCCAAC 4260 

AAGAACGCGA TTTGTCAGAT TATATTGCTA AAAAAACGAA CGTTAAACAT GAATCACCTC 4320 

AAGCATTTTA TTTTGTAAAT GGTGAAATGG TTTGGAATCG AGACCACGGT GATATCAATG 4380

TGTCGTCATT AGCACAAGCA GAAGAATAAT GAAACTATAG GGTTGGAACA TTTTGCCTTA 4440 

CACTACTAGA CGTGAATAGC ACAACTTAAA TTCGTGTGAA TCAGAGTAGT TTGGCTATAA 4500 

TGATGTTCTG ACCTTTTATT TTATGTCACC TTTAGAAGCA GTTAAGTTAG TACTTTTTTA 4560 

CAAACATATG TATAATATAT TCGAGTATTT TTATTGAAAa tATTTTGGAA AACGACGAAT 4620 

CCAATAAGAA AATTTAAACA TGATTTGTAA GTTAGTTTAA TAGGAAATAT ATGCTAAACC 4680 

AAAAGAAGCA TATTGTTATT TACTGGAATA ATTAATAATC ATGTCATGTT AAATGTTAGC 4740

ATATAATCAC GAGATAAAAT CTAAAATTTA AGATTAATCT TTTATGAATA AAAAACGTAT 4800 

CACAACAAAT AATAAAGTAA GGTGGTCAAG GTTATGAAAG TATTAGTAGC CATGGATGAG 4860

TTTCATGGAA TTATTTCAAG TTATCAAGCT AATAGATATG TTGAAGAGGC AGTTGCAAGC 4920 

CAAATTGAAA CTGCAGATGT AGTTCAAGTA CCATTGTTTA ATGGAAGACA TGAATTATTA 4980

GATTCTGTAT TTTTATGGcm ATCTGGGcaA AAGTATCGTA TACCAGTACA TGATGCAGAT 5040

ATGAATGAAG TTGAAGGTGT TTACGGACAA ACTGATACAG GGATGACCGT TATCGAGGGG 5100

AATTTATTTT TAAAAGGTAA AAAACCAATT GTTGAACGAA CAAGTTATGG TTTAGGAGAA 5160 

ATGATTAAAC ATGCATTAGA TAACGACGCA AAACATGTTG TAATTTCACT AGGTGGGATT 5220 

GATAGTTTTG ATGCTGGTGC AGGTATGTTA CAAGCATTAG GTGGTCAATT CTATGATGAC 5280

GATATGTCGA ACTTACACCC TAAAATGGAA ACAGCAAGAA TTCAAGTAAT GTCGGATTTT 5400

TCAAGTCGAT TATATGGTAA GCAAAGTGAA ATCATGCAAA CTTATGATGC GCATCAGTTG 5460

AATCATAATC AAGCAGCAGA AATCGATAAT TTAATTTGGT ATTTTAGTGA GTTATTTAAA 5520

AGTGAATTGA AAATTGCAAT TGGTCCAGTT GAACGTGGTG GTGCTGGTGG TGGAATTGCA 5580

GCAGTCTTGA ATGGACTGTA TCAAGCTGAA ATATTAACCA GTCATGCATT AGTAGACCAA 5640 

CTAACACATT TAGAAAATTT AGTTGAACAA GCGGATTTAA TTATTTTTGG AGAAGGATTA 5700 

AATGAAAATG ATCAGTTGCT AGAAACGACA ACATTGCGTA TTGCAGAACT TTGTCATAAA 5760 

CATCAAAAGG TTGCCATTGC AATTTGTGCA ACTGCTGAAA AGTTTGATTT ATTTGAATCA 5820 

CAAGGGGTTA CAGCAATGTT TAATACATTT ATCGATATGC CAGAAACTTA TACTGACTTT 5880 

AAAATGGGtT ACAAATTAGG CATTATACGG TTCAGTCTTT AAAACTGTTG AAAACACATT 5940

TTAATGTTGA GGTTTAGTAA AGAAGGACTA AATTGGTGAT GCTGTCATGA TGGTTAATAA 6000

CATTTATGAT GGTTAGCAAA ACGAATTAGA AGATCGAAAG TATACGTAAA AAATATGAAA 6060 

AATCACGCTA TCATTGCACT GAATGTTAGC GTGATTTTTA TATATTAATT AAGCCTGAGT 6120 

TGAACTAGTA TATAATCGTT GGTTTTTAGT GATTTTCAGC GATATCTTCT ACAATTCCAA 6180

TGATTACTTG TACTGCTTTT TCCaTAACAT CAATGGATGC aTATTCATAT GGGCCGTGGA 6240

AGTTACCGCA ACCTGTAAAG ATGTTTGGAG TTGGTAACCC CATAAATGAC AATTGTGAAC 6300 

CATCTGTACC ACCGCGAATA GGTTCAGTGT TTGCTGGAAT ATCTAATTTG GCAAAGACAC 6360

GTTTAGGTAT ATCAATAATA TGAGGCAATG GTAATATTTT TTCTGCCATA TTGAAATATT 6420

GATCCGATAT ATCAACTTTA ACTGGATAAT TTTCAAAATG GGCATTGATA TCGTCACGTA 6480 

TTTCTAAAAT ACGTTTCTTA CGCAATTCGA ATTGTTTTTT ATCATGATCA CGAATAATGT 6540

ATTGCAAAGT TGCTTTTTCA ACAGTTCCTT CAAAGTTCAT TAAGTGATAA AAGCCTTCGT 6600 

ATCCTTCTGT TCGCTCCGGA ACTTCACTAT CAGGTAGCAA ACTATCGAAT TGTTCACCTA 6660 

AACGTATTGC GTTTACCATT GCATTTTTAG CTGAACCAGG ATGAACATTT ACACCGTGGC 6720 

ATGTAATAAC CGCTTCAGCA GCGTTAAAGC TTTCATATTG TAATTCTCCA TATTGACTAC 6780 

CATCCATAGT ATAAGCAAAA TCAGCATTGA AGCGGTCAAC ATCAAATTTA TGTGGACCAC 6840

GACCGATTTC TTCGTCTGGT GTAAATCCAA TGCGAATGGT ACCATGTTTA ATTTCTGGAT 6900 

GTTCTTGTAA ATAACAAATA GCTTCCATAA TTTCCACAAT ACCCGCTTTA TCGTCTGCAC 6960 

C7AGTAACGA TGTACCATCA GTTACCATTA ATGTATGACC AACTAAACTG TTAAGTTCTG 7020

GAAATACTTT AGGATCTAAG ACACGTTTAG TATTGCCTAG TTTGTATGGC TTACCATCAT 7080

GCGCCAAAAA TCCAACTGTT GGGACGTCGA CATCGATGTT ACTTTCTAAT GTAGCAAATA 7200

AGTAGCCATT TTCATCTAAA TCAGTTGGCA ATCCTAATTG TTGTAATTCT TTTTCTAATA 7260 

AATGTAACAA ATCCCATTGC TTTTCAGTTG AAGGTGTTGT TGTAGATTTT GGATCAGATT 7320 

GCGTATCAAT TGTCGTATAT CTTGTTAATC TATCTATCAA TTGGTTCTTC ATTATATTCG 7380

ACCCCTXAAA CTCTATTATT CATGTTGTAA GATTTTTTAT ATGTCTTACC TTTGATTTTA 7440 

CCATACAGTT GTTTGATACG TGTGTATAGG TAATATAGAA TTTCAGAAAC TAATATACCG 7500 

AAAGCAATCG CACCTGAAAT CAGTGTAcTT CTAAAAATGT ATTTACAGCA CTTGTATAAT 7560 

CATTTGATAC TAAAAAACGA GTCGCTTGAT AAGCTGCACC ACCAGGTACT AATGGTATAA 7620 

TGCCTGGCAC TATGAATATA ATTACCGGTC GTTTATATCT GCGACTCATA GTATGACTCA 7680 

TTAAGCCTAA AATTAAGCTT CCCAAAAATG AAGCGCCAAC TTTTCCAAAC TCTAAATCTA 7740 

CCGTTAATTG GTAAATCGTC CATGCAATGG CACCCACAAA TCCACATGCT ACTAAGAGGC 7800 

GTTTGGGTGC ATTGAAAATG ATAGAGAAAA GTACTGTTGA TATAAAGCTG ATTGTAAAAT 7860 

GAAATAAATA AAATAGCATG CTTTAACAGT CCTTCCTTAA ATGATTAATA AAACGATTGC 7920 

GACACCAGCA CCGATTGCGA ATGCTGTTAA TGCAGCTTCA ACACCGCGAG ACATACCTGC 7980

AAGTAATTCA CCCGCTAATA AATCTCGAAT GGCATTGGTA ATTAATATAC CAGGGACAAG 8040 

TGGCATGACA CTGGCTATAG TAATGATATC TTGATTGGTT GCAATGCCTA ATTTAGTAAA 8100 

TGTGGCTGCA ATGGATATGA CCACAGCGGC TGCAACAAAC TCTGAGAAAA ATTTAATTTG 8160 

TATATAGCGT tGCACAAAGC TGAATGTTAA AAATGCGGAT CCGCCAGCAA TGACTGCAAT 8220 

CCAACAATCT GATGCGACAC CACCAAACAT AAATAGGAAG AAGCCACATG CAATGGCAGC 8280 

TGCAAAGAAA TTCGTTAAAA AAGAATATTG TAATGATGCA TGCTGTAAAT GAATAAATTC 8340

AGATTTAGCT TCATCAATTG TGAGTTCTTT ATTTGATATT TTACGTGAAA GACTATTCGT 8400 

TAAAGCGATT TTCTCTAAAT CTGTTGTACG CTCTTGTACA CGAATTAATC TTGTACTTGT 8460

TCGATCGTTT AATGAAAAAA TAATTGCAGT TGAACTGACA AAACTATATG TATTATGAAG 8520 

ACCATAACTA TGTGCGATAC GGTTCATTGT ATCTTCAACT CGATATGTTT CAGCACCTGA 8580 

TTCaAGTAAA ATTCTACCTG CAATTAATAC AACATCAATC ACTTTGTTTT CATCTATAAT 8640

TGTGATTGAA TCTGGCATAT CAATTCACCT CCAATGATAT GTGTTATTTA TTTGAACAAT 8700

TGaAGTTTAC AACTTGTTGT TACAACTTTC AATAGTGAGA CTTTGTGTTA GTATGATGAA 8760 

CTTGTATGGT TCAAATTTAA ATAAGAAAAA CTGTTAATCT TTGCTATTAT ACTATGATTT 8820

AATAATAGCA AAGGATTAAC AGTTTTGTCG TTGTTATAAA TTGATAATAG GGTTAAACAT 8880

TTTACGCTGT GATTTTGGAT CGTCATCTGT TAAATAACCA ACACCGATAG ACACTGACAA 9000

TTTAATAACT TCTTTGTTTG GTAAATGGAA TGATGATTTT TCAACACCCG AACGAATATT 9060 

TTCAGCTAAT TTAACACTTT GATCAAGTGA ATAATTGTGA ATGACAACTG AGAACTCTTC 9120 

GCCACCATTT CTAAAAATTT TAAATTGATT CGGCACATAG TTTTTAAGTA ATTGAGACAT 9180 

TTGTTTTAAT ACAGCATCAC CTGATTTGTG TGAGTAGGTA TCATTGaCAT CTTTAAATCC 9240 

ATCGATATCG ATTAATAATA ATGCGATACT TTGATGTTCT TTTTCAGCTT TTCGTGAAAT 9300 

TTCATTTAAA TGTCTATCAA ATTCTTTTAC ATTACCTAAG CCTGTTAAGT AATCATATTT 9360 

ATCTTCGTTT TCATAACGAT TTACGAGTGA GAAGAAATGC CAAATATCGA CAAATGTTAT 9420 

CGCTGAAGCT AAAGTGATAA TTAATGAAAT TGGTATTAAA ATGATAACTT CCGATAGTGT 9480 

GTAAATAGGA CTCACTAACG CGACACCAAA TAAAATGATT ATTGTAACAA CATTAAGTAT 9540

TAATAATGAT AGCACATCAT TTTGTTTTAA AAATGGTCCA ATAGCACTTG TTACTGCAGC 9600

AATAACAATC AACGTAACAC CGTACATAAT CGAGTTGTTA AATACTACAA TTTCAACAAT 9660 

TGCTACAATT ACTGTGGCAG ATAATGTATA GACCATATTT GTAAATCTAC CTAAAAACAA 9720 

TAAAGGAACG AATGTTAAGT GAATTAAATA ATCTTCACGA TAAGGGATAG GGTAGACAGA 9780

TAATAATAAT GATACGATTG TCATTAAAAC AGTGACATAA GCCTTAGAAA AAAC 9834 
(2) INFORMATION FOR SEQ ID NO: 38:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23439 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

TCTCAATCAG ATGAAAAATT GCATATCGTA GGTTTTACAG AAAGTGCAAA ATATAATGCG 60 

TCATCAGTCA TTTTCACGAA TGACGCTACC ATTGCCAAGA TCAATCCTAG ATTGACTGGA 120 

GATAAAATTA ATGCAGTTGT TGTACGTGAT ACAAATTGGA AAGACAAAAA ATTAAACCAA 180 

GAGCTTGAAG CGGTAAGTAT TAATGACTTT ATTGAAAATT TACCAGGTTA TAAACCACAG 240

AACTTAACAT TAAACTTTAT GATTTCATTC TTATTTGTCA TTTCAGCTAC AGTTATAGGC 300 

ATTTTCCTAT ATGTCATGAC ATTACAAAAG ACGAGTTTAT TTGGCATATT AAAAGCTCAA 360 

GGATTTACGA ATGGCTATTT GGCGAATGTG GTAATTTCGC AGACGGTCAT ATTAGCACTA 420

TTTGGTACGG CATTTGGCTT ACTGTTAACA GGCGTTACAG GTGCATTTTT ACCTGATGCA 480

TCTGTATTAG GAAGTTTATT CTCCATTTTA ACAATTAGAA AAATAGATCC GTTAAAGGCG 600

ATTGGGTAGG AGGTGTAGCA AATGTTGAAA TTTGAAAATG TAACAAAGTC ATTTAAAGAT 660 

GGGAATCGTA ACATTGAAGC GGTTAAAGAT ACAAATTTTG AGATAAATAA AGGTGATATT 720 

ATAGCATTGG TTGGACCTTC TGGCTCTGGT AAAAGTACAT TTCTAACTAT GGCAGGTGCT 780 

TTACAAACAC CGACATCTGG GCACATTTTA ATCAATAACC AAGATATTAC GACAATGAAG 840

CAAAAAGCAT TGGCAAAAGT TAGAATGTCT GAAATAGGTT TTATTTTACA AGCTACAAAC 900

CTTGTACCAT TTTTAACGGT AAAGCAACAA TTTACATTAT TGAAAAAGAA AAATAAGAAT 960 

GTTATGTCTA ATGAAGACTA TCAGCAACTT ATGTCACAAT TAGGTCTAAC TTCATTGCTT 1020 

AATAAGTTAC CTTCAGAAAT TTCAGGTGGT CAGAAACAAC GTGTGGCGAT AgCaAAGCGT 1080 

TATATACGAA TCCGTCGATT ATTTTAGCGG ATGAACCTAC CGCGGCGTTA GATACTGAAA 1140

ATGCGATTGA AGTCATTAAA ATTCTACGTG ATCAAGCCAA ACAAAGAAAG AAAGCATGTA 1200 

TTATTGTTAC ACATGATGAA CGACTTAAAG CATATTGTGA TCGTTCATAT CATATGAAAG 1260 

ATGGCGTCCT TAATCTTGAA AATGAAACAG TAGAATAGTT TTATTAAGCC GGTACATCAT 1320 

GTGCCGGTAT TTTTATGTTT ATGTATTATT TGAATAAACT TTCACATTCA ATTAATAATA 1380

ATTATTATCG AAAATCAGAA ATATTCCGTG AAATATAATA TTTTTTGTAG TAAAATGGCC 1440 

TCTAAGTA7T CAATATTTAA ATATGGGGAT TGAATATAAA ATTATCGTAA TGGGGGTCAA 1500 

TGGTTATGGA TTTATTGATA GGTACTTTAT TTTTATTTTT GGTCTTAGTG ATTTTTACAT 1560

TATTTACATA TAAAGCGCCT AATGGTATGC GTGCCATGGG AGCATTAGCT AATGCAGCAA 1620 

TCGCAACATT TTTAGTGGAA GCATTTAATA AATATGTTGG TGGCGAAGTA TTCGGTATTA 1680 

AATTTTTAGA AGAGCTAGGA GACGCTGCGG GAGGTCTAGG TGGTGTCGCT GCCGCTGGAT 1740 

TAAC&GCATT AGCTATCGGT GTGTCACCAG TATATGCATT AGTTATAGCA GCCGCGTGCG 1800 

GTGGTATGGA TTTATTACCA GGTTTCTTTG CGGGTTATAT GATTGGATAT GTGATGAAAT 1860 

ATACAGAGAA ATATGTGCCG GATGGTGTCG ACTTAATTGG ATCGATTGTC ATCTTAGCGC 1920 

CATTAGCTCG TCTTATTGCA GTATTATTAA CGCCAGTAGT GAATAGTACA TTGATTCGAA 1980

TTGGTGATAT TATCCAAAGT AGTACGAATA CGAATCCAAT TATCATGGGT ATCATTTTAG 2040

GTGGTATTA7 TACGGTTGTC GGCACAGCGC CATTGAGTTC AATGGCATTG ACAGCATTAT 2100 

TAGGTTTAAC GGGTGTACCT ATGGCTATTG GTGCCATGGC AGCATTTAGT TCGGCATTTA 2160 

TGAATGGGAC GCTATTCCAT CGCTTAAAAT TAGGTGATCG TAAGTCTACG ATTGCAGTAA 2220

GTATTGAACC TTTATCACAA GCAGATATTG TATCAGCCAA TCCAATTCCA ATCTATATTA 2280
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ATGCGACAGG TACAGCTACA CCGATTGCAG GATTTTTAGT TATGTTTGGA TTTAATCATC 2400

CGACGACAAT TGTGATTTAT GGTGTAGTAA TGGCGATTGT AGGTGCGCTT GCAGGTTATC 2460

TTGGTTCAAT TGTATTTAAA AAATATCCAA TTGTTACTAA GCAAGACATG ATTAATCGAG 2520

GTGCAGTAGA CGCATAGCAT CATCATATTG AATAGTAAAA ACAAATAAAA CATAGTAACG 2580

TGATTCAGTC GATGTAACAG TCGATAATGA GTCACGTTTT TTTATAGAAA AATACAAGAC 2640

ATAAAAATGT CATAATTTAT TGTCGACAAA TATCATACTG TATAAACATT TATCATTTTC 2700

TCAAGTACCT TTTACACGAT GGAATGAACT TACTTTTTAC GAAATTATGC GTATTTTATA 2760

AACAAATATC ATTGATATAA CGGTAAATGT AAGCGTTTAC AACAGAAATA ACAGCATGCT 2820

ACGATATTTT TGTAAATTCA CTGATTCAAG TATTTTAAGT CAATATGAGG AGGGATGTTA 2880

TGAGCGATTC TGAGAAAGAA ATTTTAAAAA GAATTAAAGA TAATCCGTTT ATTTCACAAC 2940

GTGAACTTGC TGAGGCAATT GGATTATCTA GACCCAGCGT AGCAAACATT ATTTCAGGAT 3000

TAATACAAAA GGAATATGTT ATGGGAAAGG CATATGTTTT AAATGAAGAT TATCCTATTG 3060

TTTGTATTGG CGCAGCGAAT GTAGATCGTA AGTTTTATGT GCATAAAAAT TTAGTTGCAG 3120

AAACATCAAA TCCTGTAACG TCAACACGCT CTATTGGTGG CGTAgCAAGA AATATTGCTG 3180

AGAACTTAGG TAGGCTTGGC GAAACGGTCG CTTTTTTATC TGCTAGTGGA CAAGATAGTG 3240

AATGGGAAAT GATTAAACGA TTGTCCACAC CATTTATGAA TTTGGATCAT GTTCAACAAT 3300

TTGAAAATGC GAGTACAGGT TCATATACAG CTTTAATTAG TAAAGAAGGC GACATGACAT 3360

ATGGCTTaGC AGATATGGAA GTGTTTGACT ACATTACGCC TGAATTTTTA ATTAAGCGTT 3420

CACACTTATT GAAAAAGGCT AAGTGCATTA TTGTAGATTT GAATTTAGGC AAAGAGGCAT 3480

TAAACTTCTT ATGTGCCTAT ACCACGAAAC ATCAAATCAA ATTAGTTATC ACCACGGTTT 3540

CTTCCCCAAA AATGAAAAAT ATGCCTGATT CATTACATGC TATTGATTGG ATTATCACGA 3600

ATAAAGATGA AACAGAAACA TACTTAAATT TAAAAATAGA ATCTACTGAT GATTTAAAAA 3660

TAGCTGCTAA ACGCTGGAAT GATTTAGGTG TTAAAAATGT TATTGTGACA AATGGCGTGA 3720

AAGAACTCAT TTATCGAAGT GGTGAGGAAG AAATCATTAA GTCAGTTATG CCATCAAATA 3780

GTGTGAAAGA TGTTACAGGT GCAGGCGATT CATTCTGTGC TGCAGTAGTG TATAGCTGGT 3840

TAAATGGGAT GTCTACTGAA GATATATTAA TTGCTGGTAT GGTTAACGCA AAGAAAACGA 3900

TAGAAACGAA ATATACAGTT AGGCAAAACC TAGATCAACA GCAACTTTAT CACGATATGG 3960

AGGATTATAA AAATGGCAAA TTTACAAAAG TATATTGAGT ATTCTCGAGA AGTTCAGCAA 4020

GCACGGGAGA ACAATCAACC GATTGTAGCA TTAGAATCAA CAATTATTTC GCATGGTATG 4080
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GCCATTCCAG CAACCATAGC CATTATAGAT GGCAAAATTA AAATTGGTTT AGAAAGCGAA 4200

GATTTAGAAA TACTGGCAAC TAGTAAAGAC GTTGCTAAAG TATCTAGAAG GGATTTAGCA 4260

GAAGTTATTG CGATGAAGTG TGTTGGTGCT ACTACTGTAG CGACGACGAT GATATGTGCT 4320

GCAATGGCTG GTATTCAATT TTTTGTTACA GGAGGTATTG GGGGCGTCCA TAAAGGTGCA 4380

GAACATACGA TGGACATTTC AGCAGACTTA GAAGAACTGT CTAAAACAAA TGTCACTGTT 4440

ATCTGTGCAG GTGCCAAATC AATTTTAGAC TTACCTAAGA CGATGGAGTA TTTAGAAACA 4500

AAAGGCGTTC CAGTTATTGG ATATCAAACG AATGAATTGC CAGCATTCTT CACTCGCGAA 4560

AGCGGTGTTA AGTTAACAAG TTCGGTTGAA ACGCCAGAAC GACTTGCTGA CATTCATTTA 4620

ACAAAACAGC AGTTAAATCT TGAAGGTGGC ATTGTTGTTG CTAATCCAAT TCCATATGAG 4680

CATGCCTTAT CAAAAGCATA TATTGAGGCA ATCATAAATG AAGCTGTTGT TGAAGCGGAA 4740

AATCAAGGTA TTAAAGGTAA GGACGCCACA CCGTTCTTGT TAGGGAAAAT TGTAGAAAAA 4800

ACGAATGGTA AAAGTTTAGC AGCAAATATA AAACTTGTTG AAAACAATGC GGCGTTGGGT 4860

GCTAAAATTG CTGTCGCTGT TAATAAATTA TTGTAGGTGA TGATACATGA ATATTTTATT 4920

CGCTATCACA GGGATAGCAT TTGCACTATT TGTTGCGTTT TTATTCAGTT TTGATCGTAA 4980

AAAAATAGAC TTCAAAAAGA CGTTAATAAT GATATTTATT CAAGTGTTGA TCGTGTTATT 5040

TATGATGAAC ACAACGATTG GTTTGACAAT TTTAACTGCA CTAGGTTCAT TTTTTGAAGG 5100

GCTAATAAAT ATTAGTAAAG CAGGCATAAA TTTTGTTTTT GGAGATATAC AAAATAAAAA 5160

TGGCTTTACG TTCTTTTTAA ACGTATTACT GCCATTAGTT TTTATTTCTG TATTAATAGG 5220

CATCTTTAAT TATATTAAGG TATTACCATT TATTATCAAA TATGTAGGTA TCGCTATTAA 5280

TAAAATAACT AGAATGGGGC GCTTAGAAAG TTATTTTGCT ATTTCAACAG CAATGTTTGG 5340

GCAACCAGAA GTATATTTAA CAATAAAAGA TATTATTCCA AGATTATCTA GAGCGAAATT 5400

ATATACAATT GCGACGTCTG GTATGAGTGC TGTTAGTATG GCAATGCTAG GTTCATATAT 5460

GCAGATGATT GAACCCAAGT TCGTAGTTAC AGCAGTAATG TTAAATATTT TTAGTGCGCT 5520

TA7CATCGCC AGTGTAATCA ATCCCTATAA ATCTGATGAT ACTGATGTTG AAATTGATAA 5580

CTTAACGAAA TCCACAGAAA CTAAAACATT GAATGGAAAA ACAGGAAAAC CTAAGAAAGT 5640

TGCCTTTTTC CAAATGATTG GTGATAGTGC GATGGATGGG TTTAAAATCG CTGTTGTAGT 5700

AGCCGTAATG TTGTTAGCAT TTATTTCATT AATGGAAGCA ATTAATATCA TGTTTGGTAG 5760

7GTTGGTTTG AACTTTAAAC AGCTTATTGG CTATGTGTTT GCACCAATCG CATTCTTAAT 5820

GGGGATTCCA TGGAGCGAAC TGTTCCAGCT GGCTCTTTAA TGGCGACTAA ATTAATTACA 5880
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CAAGGTATCA TTTCAGTTTA CTTAGTAAGc TTCGCTAATT TTGGTACGGT TGGTATCATC 6000

GTAGGTTCAA TTAAAGGCAT TAGTGATAAA CAAGGAGAAA AAGTTGCATC CTTTGCAATG 6060

AGGTTGCTAC TTGGTTCAAC TCTAGCTTCA ATCATTTCAG GATCAATCAT TGGCTTAGTA 6120

TTGTAAATGA ATCGAAGTAC CTAAATTAAA TTCATGGCAA AGCTAAACCC CGTCACCAAG 6180

TTGGCGCAAC AGCGcATgcA TAACTTAGTG ACGGGGTTTT ATCATAACAA TCTACTTTTT 6240

CGTAGCCGTT TTTGAAATGT ATGTTGATGG TTTATCTTTT TCAAAAATTG TTAATCCCGT 6300

TATATCTTTT TTATGTTTTG AAGGGACAAT GAAGCTAAGT ATATAAGCAA AGACAAAAGC 6360

AACTGTAAAT GAAATGGTAG ATACATAGAA AGGTGAGTTA CCTTTGCCAA CACCATTATA 6420

GACATAAGCA AAGATGATAC CCAATATTAA TCCACAAATA ACACCGAATG TATTCGTACG 6480

TTTAGTGAAA ATACCAACTG CAAATACACC AGCCAATGGA ACGCCGAATA ATCCAGTCAC 6540

AAACAAGAAT AAATCCCATA AGTCATTTGA ATTAGAAGCA ATTAAGTATA GTGACATTCC 6600

AAAACCGAAA ATACCTGCAA TGATAATAAT GAAACGTGCA AAGTTAACTT CGTGTCGCTC 6660

GCTACCTTTT CCGAAGAAGC GTTGCTTAAT GTCGATTGAA ATACAAGCAG ATATAGAATT 6720

TAAACTAGAT GAAATGGTAG ACTGTGCAGC GGCGAAAATG GCTGCAATAA GTAATCCTGC 6780

TACAAATGGT GGCATCTCAG TCAAAATGAA ATATGGCACT ACAGATGATG TATTGAAGCC 6840

TTTTGGTAAA ACAGCTTCAT GTGTATAAAA TGAATACAGC ATTGTACCCA TACCATAAAA 6900

TAAGGGTGCT GAAATTAAAG CTAGGATACC ATTTGTCCAT AACGATTTAT TTGTTTC1-1T 6960

TAAACTATCA GAAGCTTGAT AACGCTGCAC GACGTCTTGA CTCGCTGTGT ATTGATACAA 7020

GTTGTTGAAA ATATTTCCTA GGAAAATAAT TGGAATGGCA GCTGCCGCAG TATTTAGTTT 7080

CCAATTGTCT GCACTAATTA ATTTTTTGTG CTCAATCGCA TCTGCAAAGA CAGTGCCGAA 7140

ACCGCCTTTA ATGTTCACAA CACCTAGAAT AATAATAACT AAAGCGCCGC CTAATAAAAT 7200

GACGCCTTGA ATGAAATCAC TCCAAACCAC ACCTTCGAAA CCACCTAAAA ATGTATATAA 7260

AATACATAGT AAACCAACGA GTGATGCAAC GATATAAGGG TTCATGTCTG ATACAGATGT 7320

GATTGCTAAT GTTGGTAAGT AGATAACAAT TGCAACACGC CCTAAATGGT AAACGACAAA 7380

TAATAATGAG CCAATGACAC GTATGCTAGG GCCAAATCTA GCTTCTAAAT ATTCATATGC 7440

AGATGTTACC TTTAACTTTT TAAAGAAAGG GACATAGAAA TAAATAAGTA ATGGAATAAT 7500

TGCGACGATA GCAATGTTAC CAGCGATATA TGACCAATCT GTTAAAAATG CTTTCTCTGG 7560

TGTCGACATA AATGTAATCG CACTTAACGT AGTAGCATAA ATTGAAAAGC CAACTACCCA 7620

AGATGGCAAG CGACCACTTG CGGTAAAGAA ACTATTGGTA CTTTGGCTCG CGCGCTTGGT 7680
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TGTGCCAAAT CCAACTTCTT TCATGGGCAA CATCCCCTTT ACAATGTATT GATTCTTTGA 7800 

TGTCTATAAA TCGTATTTTG CAATGAGTTG ATCTAATGTT TGTCGATGTG CTTCGTTAAA 7860

AGGTTTGAAA GGTCTTTTCG GTAATCCTGC ATCAATGCCA CGATGACGTA ATATTTCTTT 7920 

CAATGTTGGA TAAATCCCCA TTGATAACAC TGTTTCGATA ATGTCGTTTG AATCATGTTG 7980 

CAGTTGGTAA GCTTCTTGAA TTTGACCTTG TCGTGCTAAG TCGAAGATTT TTCTTGCACG 8040 

GCGACCATTA ACGTTATATG TAGAACCAAT TGCACCATCT ACGCCAGAAA TCGTAGCTTG 8100 

AACTAACATT TCATCAAAGC CAGATAAGAT TAATTTGTCT GGGAATGCTT TTCTAATACG 8160 

TTCGAGTAGG AAGAAGTTTG GCGCTGTATA T7TAACACCA ACAATTTTTT CATGATTAAA 8220 

TAGCTCGCTG AATTGTTCAA TAGAAATATT CACACCTGTT AAATCTGGTA TTGCATAAAT 8280 

AATCATATTG TTCTGAGTTG CTTCGATAAT ATCGAAATAG TAATCTCTAA TTTCTTCAAA 8340 

AGTAAATGGA TAGTAGAATG GTGTTACGGC AGAAAGTGCA TCATAACCGA GTTCTGTGGC 8400 

ATATTTTCCA AGTTCAATGG CTTCATTTAA ATCTAACGAA CCTACTTGAG CAATCAATTT 8460

CACTTTATCC CCAACTGCCT CTTTGGCAAC CTTGAAAACT TGCTTCTTCT GCTCTGTATT 8520

TAATAAAAAG TTTTCGCCTG AGCTACCATT TACATAAAGA CCGTCTAATT CTTCAGTTTC 8580

AATGGCATTT TGAGCAATTT GTTTAAGTCC TTGTTCATTT ACTTGACCAT TTTCATCAAA 8640 

AGGAACGAGT AACGCTGCAT ATAAACCTTT TAAATCTTTG TTCATTATGA AGTCCCTCCA 8700 

AAAATCATTT GATAATATAG TTTACAGCTA TAATTGTAAA CGCTATCATA AAATGTAACA 8760

ATATCTTTTT GAAAATTGTA GTCATATTTA TGTATAATTA ATGAAAATGT TTTTCAAAAT 8820

CAATAGAAAT GGAGTGAGTA AGGTGTATTA CATCGCAATC GATATTGGAG GCACTCAAAT 8880

TAAATCGGCA GTTATTGATA AGCAATTGAA TATGTTTGAC TATCAACAAA TATCAACGCC 8940 

GGACAACAAA AGTGAGCTTA TTACTGACAA AGTATATGAG ATTGTAACAG GATATATGAA 9000 

GCAATATCAG TTGATCCAAC CTGTCATAGG TATTTCATCA GCAGGCGTTG TTGATGAACA 9060 

AAAAGGCGAA ATTGTATACG CAGGGCCAAC CATTCCGAAT TATAAAGGTA CTAATTTTAA 9120 

GCGATTATTA AAATCACTGT CTCCTTATGT CAAAGTAAAA AATGATGTAA ACGCTGCATT 9180 

ACTAGGCGAA TTGAAATTAC ATCAATATCA AGCAGAACGG ATCTTTTGTA TGACGCTTGG 9240 

TACAGGCATT GGGGGTGCGT ACAAGAATAA TCAAGGTCAT ATTGATAATG GTGAGCTTCA 9300 

TAAGGCAAAT GAAGTTGGGT ATTTATTGTA TCGTCCAACT GAAAATACAA CGTTTGAGCA 9360 

ACGTGCTGCA ACGAGTGCAT TGAAAAAGCG CATGATTGCC GGAGGATTTA CGAGAAGCAC 9420

ACATGTGCCA GTATTGTTTG AAGCAGCTGA AGAAGGTGAT GATATTGCAA AACAAATATT 9480 
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AGGGCTTATA TTAATTGGGG GCGGTATATC TGAACAAGGA GATAATCTCA TTAAATATAT 9600 

CGAGCCGAAA GTTGCACACT ATTTACCAAA AGACTATGTT TATGCACCAA TACAAACGAC 9660 

TAAGAGTAAA AATGATGCAG CATTATATGG CTGTTTGCAA TGATAGTTGA AAGAAGGAGT 9720 

CATTCTAAAA TAGAATTTGA AACCGTTACG AGAGATGAGA GCTGTTGTTA GTTCCACACA 9780 

TCACACTCTA TCTAGGACCA ATCTAAACTA TATCAACCAA CAGTGTGCCA CGGGCAAATT 9840 

AAATTGAAGA AGCTGAGATA TTAAAATTTT AGAAAATGTA AAAAAATATT TGGTATTGAA 9900 

ATTAAAAAAG CACCTAGCAA CTCGTTGGGA CAATCACGAT GATTGTCTAC AGTTGCAGGT 9960 

GGATTTGAAT ATACTACTAG TTATTTGTTG TCTAGGATAA TAGATTTAGT ATGTTGATAA 10020 

GTTTGACTCA GATTCGTATT TTCTAATAAA TGATAACTCA CGATATCGAT TAAAAAGAGT 10080 

GTCGCAATTT GTGTGTTGAT AAATTGATGG TCGGTATTAC GCGATTGATC CGTTGTTAAA 10140 

AGTACTAAAT CTGCACAATC TGTAAGTTTA CTACCTTCAA AATTTGTGAT GGCAACGACA 10200 

TATGCACCAT GAGATTTGGC GACTTCCGCT GCAGAAATTA ATTCCGAAGT ATTACCACTA 10260 

TTTGACATAG CAATAAACAT ATCCGAATGA GATAGTAGGG ATGCCGATAT TTTCATTAAA 10320 

TGTGAATCGG TAGTAACATT ACCTTTTAGC CCCATACGAA TCATACGATA ATAAAATTCA 10380 

GTCGCTGATA AACCAGAGCT ACCTAGTCCA GCAAAGAGTA TATGTCGACT TGATTGAAGT 10440 

TTGTCGATAA AGGTTTGGAT AATGTCGTTA TCAATAAATT CACCAGTTTG TTGAATGATT 10500 

TGTTGATGAT ATTTATGAAT TCTTTGAATA ATTGGGCTAT TTTCAATAAC TGTCTCTGTC 10560 

ATTTCTTGTT GAATATTAAA TTTTAAATCT TGGAAATTCT CATAATCCAG CTTATGACTA 10620 

AAGCGTGTCA TCGTTGCTGG TGATGTACCA ATCGCATGGG CTAAGGAGTT AATCGTTGAA 10680 

AAGGCATCGC TATAACCATT TTGTCTTATA TAATTGACGA TGCGTTTATC AGTTTTTGTA 10740 

AATAAATGTT GATAACGTTG AACACGATTC TCAAATTTCA TTGTGTCACC CCTTCATCTT 10800 

AATGATTACT ATTATATATG AAAAATATTT TCAAGATAGT AAAAAGCATT GATAAAAATT 10860 

ATCTTAATGA TATATTGTAA ATGACTTTAC GTGAAAAAAC GACTTATGGA GTGAGGAATA 10920 

ATGTTACCAC ATGGATTAAT AGTATCTTGT CAGGCACTAC CAGATGAACC ATTGCATTCA 10980 

TCTTTTATTA TGTCGAAAAT GGCATTAGCT GCGTATGAAG GTGGTGCTGT TGGTATTCGC 11040 

GCAAATACTA AGGAAGACAT TTTAGCAATT AAAGAAACGG TAGATTTACC AGTTATTGGC 11100

ATTGTGAAAC GTGACTATGA TCACTCAGAT GTTTTCATTA CTGCAACGTC AAAAGAAGTT 11160

GATGAACTGA TAGAAAGCCA ATGTGAAGTC ATTGCATTGG ATGCAACGTT ACAGCAACGT 11220

CCGAAAGAAA CGTTAGACGA ATTAGTATCA TATATTAGAA CACATGCACC GAACGTTGAA 11280 
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TATATTGGCA CGACGTTACA TGGCTATACT AGTTATACGC AAGGACAATT ACTTTATCAA 11400 

AATGACTTCC AATTTTTAAA AGATGTACTA CAAAGTGTTG ATGCAAAAGT TATTGCGGAA 11460

GGTAATGTCA TTACACCGGA TATGTATAAA CGTGTGATGG ACTTAGGCGT TCATTGTTCA 11520 

GTCGTTGGTG GTGCGATAAC ACGACCAAAA GAAATTACGA AACGTTTTGT TCAAATTATG 11580 

GAAGATTAAA TGATAACGAT AAAAAAACGA GATGACCATC ATTAATTAAA GGCACCTAAT 11640 

TATCTTAGGT GGCTGAATGA ATGTAATGGG TTCATCTCGT TTTGTTTGTT TATGATAGTG 11700 

ATTTTATTTT CAACTTTATC CAAAAATAAG TAAAGCGACG GGGATGGTGA TTAATAGCGA 11760 

CAACGCCACG CGTAAAAACC AAATGATGAT GAGTTTCCAG ACAGGTATTT TAATTTCAGT 11820 

TGCTAGTATA CATGGCACTA ATGCTGAGAA AAAGATAATG GCTGATACGC TTACTACACC 11880 

GACGACAAAT TTAGTACTCA TTGCAGCTTT AGTTACTAAC AAAGATGGTA GAAACATCTC 11940 

TACAATAGAA AckCTGACGC TTTTGCTAGT AAAGCCTGAT CAGCAATTGG GAAAATATAA 12000 

ATAAATGGAT AGAAGATATA GCCAAGCCAA TCAATGAATG GTGTATAGTT CGCTACAATC 12060 

AGTCCTAAAA AACCAATCGA TAATATAGAA GGTAAAATAC CAACAGTCAT TTCTAAACCG 12120 

TCTTTCAAAT TGTCCCAAAC GTTCTTCACG AGAGATGGTG TTAATGCATT TTGTTTCATC 12180 

GCCTCTGCAT ATGCAGTTTT CAGTCTGCTT CCTTCAATAG CAACTTCTTG TTCTCCTTCT 12240

TGTCCGTTAT AATATTCTGT TGATTCATTG CTGATTGGCG GTAGCCATGC AGTAATTGCA 12300

GTCACGACAA ATGTGATGAC TAAAGTTATC CAAAAGTATA AATTCCAATG CGGCATTAAT 12360 

CCTAAAGTTT TAGCAACGAT AATCATAAAA GTTGCTGAAA CTGTTGAAAA GCCAGTCGCA 12420 

ATAATCGTGG CTTCTCGTTT GTTGTACATC CCTTGCTTAT AGACACGATT AGTAATCAAT 12480 

AATCCTAAGG AATAACTGCC GACAAACGAA GCCACTGCAT CGACAGCGGA TTTTCCTGGT 12540 

GTTTTAAAAA TAGGTCTCAT AATAGGCTCC ATATAAACAC CGACAAATTC TAATAAGCCA 12600 

TAGCCCACTA ATAAAGAAAG cGcAATTGCA CCTACTGGAA TTAAGATACT TAATGGCATC 12660 

ATTAATTTTT CAAACAAAAA CGGACCATAG TTAGCTTTAA ATAGTATTGA TGGACCGATT 12720 

TTAAATACAT ACATTATACC GATCATTGCA CCTGCAACTT TAAATAATGT AATGACCAAG 12780 

TTTGTGATTG AAGTCATAAA AGTACGTCTC ACTATTGGTA ACGCTGTACC AATTAAAATC 12840 

ATAATCAGTG CAACATAGGG CATAAGTGGA CCTATGATTG AGCGAATGGC TAGATGAACA 12900 

TGATCGACGA AAATAGTGTT GTTACCATTA ATCGTAAAAG GAATAAAGAA ACATAGTATG 12960 

CCCACTAAAC TATAGACAAA AAAACGCCAT GCACTTGGTT GTTGTGCATT AGAATGATAT 13020 

TGATTCATTA AAGCAACCCC TTTGTTTAAA TGAATACACA AAACTGTATG ATGCATCTTC 13080 
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ATAGTTTGAA TTATTTTCAT ACCAATACAA ATTAACTAAT TATATATAGA TTGAAACTAT 13200 

ATTACTTAAT AAAATATTTA TCTTAAATGT TGTTGTGTTG ATTCAACACC ACAACTAAAA 13260 

G7GTTTATAA ATTATTTGGA AATACACATA TTTGTAAATG ATTAGTATCG ATTTAATATC 13320 

GTATTATTAA ATTTTTATTA ATTTTGTAGT CTTAATCmAA AAATAATATA TGTCATGTTA 13380

TATTGAAGGT GCAGTTGTTT TTCATTCTCA AGAGGGGGTC AAAAAAATAC TTTTGAGGTG 13440 

ATTATATGTT AAGAGGACAA GAAGAAAGAA AGTATAGTAT TAGAAAGTAT TCAATAGGCG 13500 

TGGTGTCAGT GTTAGCGGCT ACAATGTTTG TTGTGTCATC ACATGAAGCA CAAGCCTCGG 13560 

AAAAAACATC AACTAATGCA GCGGCACAAA AAGAAACACT AAATCAACCG GGAGAACAAG 13620 

GGAATGCGAT AACGTCACAT CAAATGCAGT CAGGAAAGCA ATTAGACGAT ATGCATAAAG 13680 

AGAATGGTAA AAGTGGAACA GTGACAGAAG GTAAAGATAC GCTTCAATCA TCGAAGCATC 13740 

AATCAACACA AAATAGTAAA ACAATCAGAA CGCAAAATGA TAATCAAGTA AAGCAAGATT 13800 

CTGAACGACA AGGTTCTAAA CAGTCACACC AAAATAATGC GACTAATAAT ACTGAACGTC 13860 

AAAATGATCA GGTTCAAAAT ACCCATCATG CTGAACGTAA TGGATCACAA TCGACAACGT 13920 

CACAATCGAA TGATGTTGAT AAATCACAAC CATCCATTCC GGCACAAAAG GTAATACCCA 13980 

ATCATGATAA AGCAGCACCA ACTTCAACTA CACCCCCGTC TAATGATAAA ACTGCACCTA 14040 

AATCAACAAA AGCACAAGAT GCAACCACGG ACAAACATCC AAATCAACAA GATACACATC 14100 

AACCTGCGCA TCAAATCATA GATGCAAAGC AAGATGATAC TGTTCGCCAA AGTGAACAGA 14160 

AACCACAAGT TGGCGATTTA AGTAAACATA TCGATGGTCA AAATTCCCCA GAGAAACCGA 14220 

CAGATAAAAA TACTGATaAT AAACAACTAA TCAAAGATGC GCTTCAAGCG CCTAAAACAC 14280 

GTTCGACTAC AAATGCAGCA GCAGATGCTA AAAAGGTTCG ACCACTTAAA GCGAATCAAG 14340 

TACAACCACT TAACAAATAT CCAGTTGTTT TTGTACATGG ATTTTTAGGA TTAGTAGGCG 14400 

ATAATGCACC TGCTTTATAT CCAAATTATT GGGGTGGAAA TAAATTTAAA GTTATCGAAG 14460 

AATTGAGAAA GCAAGGCTAT AATGTACATC AAGCAAGTGT AAGTGCATTT GGTAGTAACT 14520 

ATGATCGCGC TGTAGAACTT TATTATTACA TTAAAGGTGG TCGCGTAGAT TATGGCGCAG 14580 

CACATGCAGC TAAATACGGA CATGAGCGCT ATGGTAAGAC TTATAAAGGA ATCATGCCTA 14640 

ATTGGGAACC TGGTAAAAAG GTACATCTTG TAGGGCATAG TATGGGTGGT CAAACAATTC 14700 

GTTTAATGGA AGAGTTTTTA AGAAATGGTA ACAAAGAAGA AATTGCCTAT CATAAAGCGC 14760

ATGGTGGAGA AATATCACCA TTATTCACTG GTGGTCATAA CAATATGGTT GCATCAATCA 14820

CAACATTAGC AACACCACAT AATGGTTCAC AAGCAGCTGA TAAGTTTGGA AATACAGAAG 14880
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ATTTAGGATT AACGCAATGG GGCTTTAAAC AATTACCAAA TGAGAGTTAC ATTGACTATA 15000 

TAAAACGCGT TAGTAAAAGC AAAATTTGGA CATCAGACGA CAATGCTGCC TATGATTTAA 15060 

CGTTAGATGG CTCTGCAAAA TTGAACAACA TGACAAGTAT GAATCCTAAT ATTACGTATA 15120 

CGACTTATAC AGGTGTATCA TCTCATACTG GTCCATTAGG TTATGAAAAT CCTGATTTAG 15180 

GTACATTTTT CTTAATGGCT ACAACGAGTA GAATTATTGG TCATGATGCA AGAGAAGAAT 15240 

GGCGTAAAAA TGATGGTGTC GTACCAGTGA TTTCGTCATT ACATCCGTCC AATCAACCAT 15300 

TTGTTAATGT TACGAATGAT GAACCTGCCA CACGCAGAGG TATCTGGCAA GTTAAACCAA 15360 

TCATACAAGG ATGGGATCAT GTCGATTTTA TCGGTGTGGA CTTCCTGGAT TTCAAACGTA 15420 

AAGGTGCAGA ACTTGCCAAC TTCTATACAG GTATTATAAA TGACTTGTTG CGTGTTGAAG 15480 

CGACTGAAAG TAAAGGAACA CAATTGAAAG CAAGTTAAAT TCATCTTCTG AATTTAATAT 15540 

GCTATGTAAA TCGTGCTGTT ATCATGGCAC ATCAGATATA AGTAGCATCA CAGTGTTGAA 15600 

TTTAAAAATA GTAAAGTGAA ATAAAGCGCC TGTCTCATTA GCGAAAACTA AAGGGACAGG 15660 

CGTATCTGTT TATGAGCTTA ATAAATTGTA TGAATAATAT GGTTGATCGA ATAACTGTTT 15720 

ATCATGATGA TAAATTGAGT TTTTTAAAAT AATGATATAT TACATCATTG TTATAGCGTT 15780 

TAAGAAATCA ACAACTTTAC GATAAATAGT GATTGCTTCG TCATTAGGTC TACGATCAAA 15840 

ATCATGCTCG TTTTTATTCA CGCGTTCAAA TGTTGAATGT GGAACATGAT TCATGATATG 15900 

TTCGCTTTCC TCAACGGGAA CATCATAATC GCCATTACAA TGCGCAATGA AAACAGGTGG 15960 

AAGTGTTTTA AGTTCATCTG GTGCAATATT ATATTTTGAA TTAGTATAAT CAGCAATGTT 16020 

AATCATATTT ATCCATTTAC CTGTGCCACG TGCATAAACG TAGATTAAAA AACGTTGTGC 16080 

GATTTGATCT TGAACAACCG GTGTTGGTGA AGTGAGTTGT GCAATCATTG TTTCGTTTAC 16140 

GCTTTGAGCT ATTTTTGCGT AATAACTATT AGTTGTTTTA AAAGGTTCAG TGTTGATGCG 16200 

ACTATAACCA TAAAAATCAA TAACACCATC AATATCTCTG TCTCGTGCAA TTAATAGACT 16260

TAAATATGCA CCTGATGATC TGCCAAAGGT AAAAATAGGG CAATTAGAAT ATTGTGATTG 16320

AATCGCATCG AATGAtGCgn AGrxACATCCT CAATAATGCA ATCGAGACTT ACTTCTGGTA 16380

ATAAACGATA ACTTAGTTGA ATTAAATCGT AATGTTCCGT AAgATATCGA TATACTGTGG 16440 

GGATAAATCG TTAGCTTTAC CGAACATTAA TCCACCACCG TGGATGTAGA CAATAGCGCC 16500 

TTTTGTTGGT TGATTTTTTG CTTTAATAAT TGTGTAAGGT AATGCAAATG CATCTTTAGT 16560 

AATTACTTTA TCTTTAATTT CAGTCACGAT TTAATAGGCT CCTTATTTTT GATATTGATG 16620

TCATTATAAC ACTGTCTTAA ATTTCCATGA AAAATAGTCT TAAGACGATG AGTCATGATA 16680 
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CATCATTTTA ACAATATCTT TAAAAGCAGC ATGTGGAATG GCTAAATCTT CTAAATCTGC 16800 

CATAGAAAAT TCAAGATTGA TATCATGTGG TCGCTGTTCA GCAAGTTTAT GCACAAAGTC 16860 

AGGTTCTGTG ACAAAAGGCG AAGACATGCC GACCA7ATCT GCATGTTGTA AAGCATCTAA 16920 

AGCAGACTCT GGAGAATTAA TCCCGCCACT TGCAAITAAA GGGATACGAC CTGCTAAATG 16980 

TTCATAGACA ATTTGGTTAA CTGGTCGACC GAAATGATCA CCTGGTGTAC GAGACGTATT 17040 

TTGATAAATA TGTCGACCCC AGCTAGCGAT TGCTAAGTAT TGGATGTTTG AAACGTCCAT 17100 

GACCCAATTG ATTAATTGGT TGAACTCGTC AATGGTATAT CCTAAATCAC TGCCTCTGGT 17160 

TTCTTCTGGC GTTGCTCGAA ATCCTAAAAT AAAATTGTCA GGTGCTTCTT TATCAATCAC 17220 

TTCTTGTACC GCACGCATAA CTTCTAAACA TAATCTTGCA CGATTTTTTA ATGAGTCGGC 17280

ACCGTAATGG TCTGTACGTT TATTCGAAAA AGTTGAGAAA AATGTTTGAA TCAGCAAACG 17340 

TTGTGCAATC GAAATTTCCA CACCATCAAA ACCTGCTTTA ATCGCGCGTA ATGTAGCATC 17400

GCGATACTGC TGAATGATGC TATTGATTTT CTCATGAGAC ATGGCGATAA CATCGTGTTC 17460 

AATCGGTGAA TGCAATGTCA TAGGGCTTGG TCCATACACC TTTCCAAAAT TTAAAATGGC 17520 

TTGATTTGAA AAACGACCAG CATGCGCTAg CTGGATAATA GCGAGGCTAC CATGTTGTT7 17580

CATCGTAGAT GCCATGTTAG TTAATCCAGG GATACAAGCA TCATGATCAA TATTAAAGCC 17640 

ATATTCAAAC AATTGACCAT AAGGTTCAAT GTAAGCAGCG CCGGTGACTT GCATTCCAGC 17700 

TGAATTAGAG CGACGTGCAG CATAAGCCAA GTCTTCTTTT GTAATATAGC CTTCTTTTGT 17760 

TGATGTGTTT ACGGTCATTG GTGA7AATAC AAAGCGATTC GAAATTTTGA TGCCATTAGG 17820 

TAAGTGGATT GATTGTAAAA GTGGTTTGTA TCGGTACATA CTATGATTCC TTTTCTATTC 17880

AATATTGTTT TCAAAGTACC ATGGAAAGAA TGAATAATCA ATGATGAACA GTCTTGATAG 17940 

AATAGAATTG GTACATGGAA AGTATTTTTA AAATTAAACT AATGAATGGC ATTTGTAGGT 18000 

CTGAAAATAT GAATATGAAA AAGAAAAATA AAGGCGAAAA GATATAAAAG TTAATTGAAA 18060 

AACGTTATCA TATACGTGGG TATATGAAGA GGGAATGGTA TTAAGAACGC TAAAATGTTA 18120 

TGTCGGTTTG ACATGACAGG ATAAGTTTGG AGATGACGGA TTGGTTAAAT TAAGCGTATT 18180

AGACTATGCC TTAATAGATG AAGGTAAGGA TGCACAAAAG GCATTGCAAG ATTCAGTGAC 18240

ACTTGCAAAA TTAGCAGATC GACTTGGCT7 TAAGCGAATT TGGTTTACGG AACATCATAA 18300 

TGTACCAGCG TTTGCGTGTA GTAGTCCAGA ACTTTTGATG ATGCATACAT TGGCGCAGAC 18360 

AAATCACATA CGAGTTGGCT CTGGTGGTGT GATGCTGCCG CACTATCGAC CTTATAAAAT 18420

TGCTGAGCAT TTTAGAATGA TGGCAGCGTT ATATCCAAAT CGTATTGATT TAGGTATTGG 18480
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TAGTTACGAT GAATCGATTT CGTTATTACG TGATTATCTT ACAATAAAGG ATAAACCAAG 18600

TGCGCATACG TTAGGTGTCC AACCACACAT TGATCATTTT CCAGAAATGT GGTTATTAAG 18660

TAGTAGCGCA ACATCTGCCA AAATAGCTGC CGAACTAGGT ATAGGGCTTT CTGTTGGAAC 18720

ATTTTTGCTA CCAGATATAA ATGCGATACA TACAGCGAAG GATAACATTG ATATTTACAA 18780

AAAACATTTC CAAGCATCAA CGATTAAAAT GGACGCAAAG GTGATGGCAT CTGTATTTGT 18840

CATTGTAGCT GATAACGAAG CGGAAGTAGC AGCATTACAA CATGCCTTAG ATGTTTGGTT 18900

ATTAGGTAAA TTACAATTTG CAGAATTTGA AGATTTTCCT TCAGTAGACA CAGCACAAAA 18960

GTATAAGCTT AATGATCGAG ACAAAGAGAT GATTCAAGCA CATCAAGCAC GCATCATTGC 19020

AGGTACACAA GAAAAGGTTA AAGCACAATT AGATGATTTC ATTGCTACGT TTGAAGTTGA 19080

TGAGGTGTTA GTAGCACCGC TTATTCCAGG TATTGAACAG CGTTGTAAAA CATTAAAATT 19140

ACTCGCGGAA ATTTATTTGT AGCATTTTAA ATAGAAGAGA AAGGATGAAG ATAAGATGAA 19200

AAAGTTAGCC AATTATTTAT GGGTAGAAAA AGTAGGAGAT TTGTATGTGT TTAGTATGAC 19260

ACCTGAATTG CAAGATGATA TTGGGACAGT AGGTTATGTT GAATTCGTAA GTCCAGATGA 19320

AGTTAAAGTG GATGATGAAA TTGTGAGTAT CGAAGCATCG AAAACGGTCA TTGATGTGCA 19380

AACGCCATTG TCAGGAACGA TTATTGAGCG AAATACAAAA GCGGAAGAAG AACCGACAAT 19440

TTTAAACTCT GAAAAACCAG AAGAAAATTG GTTGTTCAAA TTGGATGATG TCGATAAAGA 19500

AGCATTCCTA GCATTACCGG AGGCTTAAAT GGAAACGTTA AAATCAAATA AAGCGAGACT 19560

TGAATATTTA ATCAATGATA TGCATCGAGA GAGAAATGAC AATGACGTAT TGGTAATGCC 19620

ATCTTCATTT GAAGATTTGT GGGAATTATA TCGAGGCTTA GCAAATGTCA GACCGGCATT 19680

ACCTGTAAGT GATGAATATT TAGCTGTACA AGATGCTATG TTAAGTGATT TGAATCGTCA 19740

ACATGTTACG GATTTGAAGG ATTTGAAGCC GATAAAAGGT GACAATATCT TTGTTTGGCA 19800

AGGTGATATC ACGACGTTAA AAATCGATGC TATTGTTAAT GCTGCAAATA GTCGTTTTCT 19860

AGGATGTATG CAAGCTAATC ATGACTGCAT TGATAATATT ATTCATACAA AAGCGGGTGT 19920

TCAAGTTCGA CTTGATTGTG CAGAGATCAT TCGACAACAA GGGCGCAATG AAGGTGTAGG 19980

TAAAGCCAAA ATAACACGTG GATATAATTT GCCAGCAAAG TATATAATTC ATACGGT7GG 20040

TCCGCAAATA CGTCGATTGC CTGTTTCAAA GATGAATCAG GACTTGTTAG CTAAATGTTA 20100

TCTTAGCTGT CTTAAATTGG CTGATCAACA TAGTTTAAAT CATGTCGCTT TTTGCTGTAT 20160

ATCTACAGGT GTATTTGCTT TTCCTCAAGA TGAAGCAGCA GAAATTGCTG TTCGAACAGT 20220

AGAAAGCTAT CTCAAAGAAA CAAATTCAAC ATTGAAAGTC GTGTTCAATG TATTTACAGA 20280
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CAATGTCTCT GTTAATGGAT GACAAGACAA AGCAGGCTGA AGTATTGCGT ACTGCGATTG 20400 

ATGAAGCAGA TGCGATAGTG ATTGGAATTG GTGCAGGCAT GTCTGCATCT GACGGATTTA 20460 

CATATGTAGG AGAGCGTTTT ACGGAAAATT TCCCAGATTT TATTGAAAAA TATCGCTTCT 20520 

TTGATATGTT GCAAGCGAGT TTACATCCTT ATGGCAGTTG GCAAGAGTAT TGGGCATTTG 20580 

AGAGTCGTTT TATTACATTA AACTATTTAG ATCAACCTGT AGGTCAGTCT TACCTCGCTT 20640 

TAAAATCCTT GGTGGAAGGT AAACAGTACC ACATTATAAC TACGAATGCA GATAATGCTT 20700 

TCGATGTAGC TGATTATGAT ATGACTCATG TATTTCATAT ACAAGGGGAG TATATACTGC 20760 

AACAGTGTAG cTCAGCATTG TCATGCTCAA ACGTATCGCA ATGATGATTT AATTCGTAAA 20820 

ATGGTTGTTG CGCAACAAGA TATGCTTATA CCTTGGGAGA TGATTCCAAG ATGTCCAAAA 20880 

TGTGATGCCC CAATGGAAGT GAATAAACGT AAAGCGGAAG TTGGGATGGT TGAAGATGCT 20940 

GAATTTCATG CGCAACTACA TCGTTATAAT GCTTTTCTAG AGCAACATCA AGATGATAAA 21000

GTGTTGTATT TGGAAATTGG AATTGGTTAT ACTACACCAC AATTTGTGAA GCATCCTTTT 21060 

CAGCGTATGA CACGTAAAAA TGAAAATGCC CTTTA7ATGA CGATGAATAA AAAGGCATAT 21120 

CGCATTCCGA ATTCAATTCA AGAACGTACC ATACATTTAA CTGAGGATAT CTCAACATTG 21180 

ATTACAGCAG CACTCCGGAA CGACAGCACA ACGAAAAATA ACAACATTGG AGAGACAGAA 21240

GATGTACTTA ATAGAACCGA TTAGAAATGG AGAATATATT ACTGATGGTG CGATTGCACT 21300

CGCTATGCAA GTTTATGTTA ACCAGCATAT CTTTTTAGAT GAAGATATTT TATTCCCTTA 21360

TTATTGTGAT CCAAAAGTGG AAATTGGACG TTTTCAAAAT ACTGCTATAG AAGTGAATCA 21420

AGATTATATA GATAAACACA GTATTCAAGT AGTTCGCCGA GATACTGGTG GTGGCGCTGT 21480

GTATGTTGAT AAAGGTGCCG TTAATATGTG TTGTATTTTA GAACAAGACA CTTCAATTTA 21540 

TGGTGATTTT CAACGATTTT ATCAACCAGC TATAAAGGCG TTGCATACAT TAGGTGCAAC 21600 

AGATGTGGTA CAAAGCGGTA GAAATGATTT AACATTGAAT GGTAAAAAAG TGTCAGGCGC 21660 

CGCAATGACA TTAATGAATA ATCGTATTTA TGGCGGTTAT TCGCTATTAC TTGATGTTAA 21720 

TTATGAAGCA ATGGATAAAG TGTTAAAGCC TAATCGCAAA AAGATTGCAT CGAAAGGGAT 21780

TAAATCTGTG CGCGCACGTG TTGGTCATCT TAGAGAAGCA CTGGATGAAA AGTATCGTGA 21840

TATAACCATT GAAGAATTTA AAAATTTAAT GGTGACGCAG ATTTTGGGAA TCGATGACAT 21900 

TAAAGAGGCG AAACGATATG AATTAACGGA TGCAGATTGG GAAGCGATTG ATGAATTAGC 21960 

TGATAAAAAG TATAAAAATT GGGATTGGAA TTATGGCAAG TCACCCAAAT ATGAATACAA 22020 

TCGAAGTGAA AGATTATCTT CAGGTACGGT AGACATAACA ATTTCTGTTG AACAAAATCG 22080 
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AGAAGCATTA CAAGGAACAA AAATGACAAG AGAAGATTTA ACGCATCAGT TAAAGCAATT 22200 

AGACATCGTT TATTATTTTG GCAATGTTAC GGTAGAAGCA TTAGTGGATA TGATTTTAAG 22260 

TTAATATTGT TATTTTATGT ATGCTGAATC ATTGGAAGTG TTTGCTTGCT CTTGAAAAGG 22320

TGACAATAGT GTTTGGTGAA GGTTGAACAT ATGAGTGGAA ATTATTGCCT TTAACTATTC 22380

AAAGTATGAT ATATATATGG TTTTTGTTTC TAAATGATTG GGTATTTGAA AATAGATGAG 22440

TTTAATATTT TAAGGAATAT AATGATGTTT ACTTTTATAA TTCATATAGA ATATTAAGCA 22500

ATATAAGTCT GTTGATATAT ACAAAATATA ATGACTGCTA TAATGAGTAA TCAATAGACA 22560

CAAAGAGGAG ATTATGTGAT GAATAATAAA GTATTAGTAA CCGGTGGTAC AGGGTTTGTT 22620

GGCATGCGAA TTATTTCACG ATTATTAGAA CAAGGTTATG ACGTACAAAC GACGATACGT 22680

GATTTAAGTA AAGCTGATAA AGTAATTAAA ACAATGCAAG ACAATGGCAT TTCCACAGAG 22740

CGATTAATGT TTGTCGAAGC GGATTTATCA CAAGATGAAC ATTGGGATGA AGCAATGAAA 22800

GATTGCAAGT ATGTCTTGAG TGTAGCATCT CCGGTGTTTT TCGGTAAAAC AGACGATGCA 22860 

GAAGTGATGG CGAaCTGcAA TTGAAGGTAT ACAACGTATT TTAAGAGCTG CAGAACATGC 22920 

GGGTGTTAAA CGTGTGGTAA TGACTGCAAA CTTTGGTGCA GTTGGTTTTA GTAATAAAGA 22980

TAAAAATTCA ATCACAAATG AAAGTCATTG GACAAATGAA GATGAACCAG GCTTATCAGT 23040

ATATGAAAAA TCAAAATTGT TAGCTGAAAA GGCAGCGTGG GATTTTGTTG AGAATGAAAA 23100

TACAACAGTA GAATTTGCCA CAATCAATCC AGTTGCAATT TTTGGGCCAT CATTAGATGC 23160

ACACGTTTCA GGAAGCTTTC ATTTATTAGA AAATTTATTG AATGGTTCAA TGAAACGTGT 23220 

ACCGCAAATT CCGTTAAATG TTGTTGATGT GAGAGACGTA GCTGAACTGC ACATTTTGGC 23280 

AATGACAAAT GAACAAGCTA ATGGCAAGCG ATTTATTGCG ACGGCTGATG GACmAATTwA 23340

tTTGTTGGGA ATTGcCAAAt TAATTAAAGA AAAGGGCCTG GAAATAGCTC CAAAAGTTCC 23400

TACTAAAAAA TTACCCAGCT TTATTTTGAG CnAnGnGCC 23439 

(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4522 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:

CCCTTTGAGA GTATATCATC TAGTCAAATT ATGCCTGTCA TTAGAGCGAC TAGCTTTGAT 60 
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TATTATGCAG TCGATTTAGG GAAATCATAT CGTCTAATTG ACGAAAGCAT GTTAGAGGAT 180 

TTGAAGTTAA CTGAACAACA AATAAGAGAA ATGTCTCTGT TTAATGTTAG AAAATTGTCA 240

AATTCATATA CGACTGATGA AGTAAAAGGT AATATTTTTT ATTTTATTAA CTCAAATGAC 300

GGGTATGATG CAAGTAGGAT ACTAAATACT GCATTTTTAA ATGAAATTGA GGCACAATGT 360

CAAGGCGAAA TGCTCGTAGC AGTGCCACAC CAAGATGTGT TAATTATTGC AGATATACGC 420

AATAAAACAG GATATGATGT GATGGCACAT TTAACAATGG AATTTTTCAC TAAAGGTCTA 480

GTTCCAATTA CATCATTATC CTTTGGATAT AAACAGGGTC ATCTTGAACC GATATTTATT 540

TTAGGTAAAA ATAATAAACA AAAAAGAGAT CCAAACGTGA TTCAGCGTTT AGAAGCAAAT 600 

CGTCGTAAAT TTAATAAAGA TAAATAGAAA TAATTGGATA AGGAGTTTTG TCATAATGAA 660 

TTTATTTTAC AATCCTAAAT ATGTAGGAGA TGTCGCATTT TTACAAATTG AACCAGTTGA 720 

AGGTGAATTA AACTACAATA AAAAAGGTAA TGTTGTTGAA ATTACtAATG AAGGTAATGT 780 

TGTAGGTTAT AATATTTTTG AAATTTCAAA AGATATAACA ATTGAAGAAA AAGGTCATAT 840 

TAAATTAACT GATGAACTTG TAAATGTATT CCAAAAGCGT ATTTCAGAAG CTGGTTTTGA 900 

TTATAAATTA AATGCTGATC TATCACCGAA ATTTGTAGTT GGCTACGTTG AAACTAAAGA 960

CAAACATCCT GATGCAGATA AATTAAGTGT ACTAAATGTA AACGTTGGAA ATGACACATT 1020 

ACAAATTGTA TGTGGCGCGC CTAACGTTGA AGCTGGACAG AAAGTTGTTG TTGCTAAAGT 1080 

AGGTGCAGTG ATGCCTAGCG GTATGGTAAT TAAAGATGCT GAATTACGTG GTGTTGCCTC 1140

AAGCGGTATG ATTTGTTCAA TGAAAGAATT GAATTTACCT AATGCACCTG AAGAAAAAGG 1200 

TATTATGGTA TTAAATGACA GCTATGAAAT TGGACAAGCA TTtTTTGAAT AATTAAGGAA 1260 

GGTAGTGAAA ATATGAGCTG GTTTGATAAA TTATTCGGCG AAGATAATGA TTCAAATGAT 1320

GACTTGATTC ATAGAAAGAA AAAAAGACGT CAAGAATCAC AAAATATAGA TrACGATCAT 1380

GACTCATTAC TGCCTCAAAA TAATGATATT TATAGTCGTC CGAGGGGAAA ATTCCGTTTT 1440

CCTATGAGCG TAGCTTATGA AAATGAAAAT GTTGAACAAT CTGCAGATAC TATTTCAGAT 1500 

GAAAAAGAAC AATACCATCG AGACTATCGC AAACAAAGCC ACGATTCTCG TTCACAAAAA 1560 

CGACATCGCC GTAGAAGAAA TCAAACAACT GAAGAACAAA ATTATAGTGA ACAACGTGGG 1620 

AATTCTAAAA TATCACAGCA AAGTATAAAA TATAAAGATC ATTCACATTA CCATACGAAT 1680 

AAGCCAGGTA CATATGTTTC TGCAATTAAT GGTATTGAGA AGGAAACGCA CAAGCCAAAA 1740

ACACATAATA TGTATTCTAA TAATACAAAT CATCGTGCTA AAGATTCAAC TCCAGATTAT 1800

CACAAAGAAA GTTTCAAGAC TTCAGAGGTA CCGTCAGCTA TTTTTGGCAC AATGAAACCT 1860 
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AAACAAAAAT ATGATAAATA TGTAGCTAAG ACGCAAACGT CTCAAAATAA ACAATTAGAA 1980 

CAAGAAAAAC AAAATGATAG TGTTGTCAAA CAAGGAACTG CATCTAAATC ATCTGATGAA 2040

AATGTATCAT CAACAACAAA ATCAATGCCT AATTATTCAA AAGTTGATAA TACTATCAAA 2100 

ATTGAAAATA TTTATGCTTC ACAAATTGTT GAAGAAATTA GACGTGAACG AGAACGTAAA 2160 

GTGCTTCAAA AGCGTCGATT TAAAAAAGCG TTGCAACAAA AGCGTGAAGA ACATAAAAAC 2220 

GAAGAGCAAG ATGCAATACA ACGTGCAATT GATGAAATGT ATGCTAAACA AGcGGAACgC 2280 

TATGTTGGTG ATAGTTCATT AAATGATGAT AGTGACTTAA CAGATAATAG TACAGATGCT 2340 

AGTCAGCTTC ATACAAATGG CATAGAGAAT GAAACTGTAT CAAATGATGA AAATAAACAA 2400 

GCGTCAATAC AAAATGAAGA CACTAATGAC ACTCATGTAG ATGAAAGTCC ATACAATTAT 2460 

GAGGAAGTTA GTTTGAaTCA AGTATCGACA ACAAAACAAT TGTCAGATGA TGAAGTTACG 2520 

GTTTCGAATG TAACGTCTCA ACATCAATCA GCACTACAAC ATAACGTTGA AGTAAATGAT 2580 

AAAGATGAAC TAAAAAATCA ATCCAGATTA ATTGCTGATT CAGAAGAAGA TGGAGCAACG 2640

aATAAAGAAG AATATTCAGk AAGTCAAATC GATGATGCAG AATTTTATGA ATTAAATGAT 2700 

ACAGAAGTAG ATGAGGATAC TACTTCAAAT ATCGAAGATA ATACCAATAG AAACGCGTCT 2760 

GAAATGCATG TAGACGCTCC TAAAACGCAA GAGTACGCAG TAACTGAATC TCAAGTAAAT 2820 

AATATCGATA AAACGGTTGA TAATGAAATT GAATTAGCAC CGCGTCATAA AAAAGATGAC 2880

CAAACAAACT TAAGTGTCAA CTCATTGAAA ACGAATGATG TGAATGATAA TCATGTTGTG 2940 

GAAGATTCAA GCATGAATGA AATAGAAAAG AATAACGCAG AAATTACAGA AAATGTGCAA 3000 

AACGAAGCAG CTGAAAGTGA ACAAAATGTC GAAGAGAAAA CTATTGAAAA CGTAAATCCA 3060 

AAGAAACAGA CTGAAAAGGT TTCAACTTTA AGTAAAAGAC CATTTAATGT TGTCATGACG 3120

CCATCTGATA AAAAGCGTAT GATGGATCGT AAAAAGCATT CAAAAGTCAA TGTGCCTGAA 3180 

TTAAAGCCTG TACAAAGTAA GCAAGCTGTG AGTGAAAGAA TGCCTGCGAG TCAAGCCACA 3240 

CCATCATCAA GATCTGATTC ACAAGAGTCA AATACAAATG CATATAAAAC AAATAATATG 3300

ACATCAAACA ATGTTGaGAA CAATCAACTT ATTGGTCATG CAGAAACAGA AAATGATTAT 3360

CAAAATGCAC AACAATATTC AGAGCAGAAA CCTTCTGTTG aTTCAACTCA AACGGAAATA 3420

T7TGAAGAAA GTCAAGATGA TAATCAATTG GAAAATGAGC AAGTTGATCA ATCAACTTCG 3480

TCTTCAGTTT CAGAAGTAAG CGACATAACT GAAGAAAGCG AAGAAACAAC ACATCCAAAC 3540

AATACTAGTG GACAACAAGA TAATGATGAT CAACAAAAAG ATTTACAGTC ATCATTTTCA 3600

AATAAAAATG AAGATACAGC TAATGAAAAT AGACCTCGGA CGAACCAACA AGATGTTGCA 3660
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CCAAGTGTTT CATTACTAGA AGAACCACAA GTTATTGAGT CGGACGAGGA CTGGATTACA 3780 

GATAAAAAGA AAGAACTGAA TGACGCATTA TTTTACTTTA ATGTACCTGC AGAAGTACAA 3840

GATGTAACTG AAGGTCCAAG TGTTACAAGA TTTGAATTAT CAGTTGAAAA AGGTGTTAAA 3900

GTTTCAAGAA TTACGGCATT ACAAGATGAC ATTAAAATGG CATTGGCAGC GAAAGATATT 3960

CGTATAGAAG CGCCTATTCC AGGAACTAGT CGTGTTGGTA TTGAAGTTCC GAACCAAAAT 4020

CCAACGACAG TCAACTTACG TTCTATTATT GAATCTCCaA GTTTTAAAAA TGCTGAATCT 4080

AAATTAACAG TTGCGATGGG GTATAGAATT AATAATGAAC CATTACTTAT GGATATTGCT 4140

AAAACGCCAC ACGCACTAAT TGCAGGTGCA ACTGGATCAG GGAAATCAGT TTGTATCAAT 4200 

AGTATTTTGA TGTCTTTACT ATATAAAAAT CATCCTGAGG AATTAAGATT ATTACTTATC 4260 

GATCCAAAAA TGGTTGAATT AGCTCCTTAT AATGGTTTGC CACATTTAGT TGCACCGGTA 4320 

ATTACAGATG TCAAAGCAGC TACACAGAGT TTAAAATGGG CCGTAGAAGA AATGGAACGA 4380 

CGTTATAAGT TATTTGCACA TTACCCATGT ACGTAnTATA ACAGCATTTA ACnAAAAAGC 4440 

CCCATATGAT GAAAGAATGn CAAAAATTGT CATTGTAaTT GATGAGTTGG CTGATTTAAT 4500 

GATGATGGTC CGCAAGAAGT TG 4522 

(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 751 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

TCAAGTTTAC GGATACGTAT ATATTTTGCA TGACATTTAG TGCAATAATA TTCATAATTT 60 

GCCCGTTGTT GATAGCTTTC AATGCTGTTA CAAAATCTAG GCGCTCCAAC CTGTTGGCTC 120 

AATCGTTTAA AATCTTGATC TTTATGTTGA TAACCTTTAC CAGCAATATG CAAGTGATAA 180 

TGACACAATT CGTGCAGTAT AATTTTTACA ACAGCATCTT CTCCATAATG CTCATATTGT 240

TTTGGATTAA TTTCAATATC ATGGGACTTT AAAAGATAAC GTCCGCCTGT TGTACGTAAC 300 

CTTTTATTAA AATATGCACA ATGTCGAAAC GTACGTCCAA ATTTTTCTTC CGAAAGATTC 360 

TCAACCATTC GCTGAAGTTT GTCATTATTC ATGTGGATCA ATCATCGTTA ATGATACTTT 420

GTCTTTATTT TTGTCAATAC TGTAAATCCA AACGTCAACG ATATCACCAA CACTGACAAT 480

ATCCATTGGA TTTTTTACGA ACTTCTTAGA AAGTTTCGAA ACATGGACAA GTCCATCTTG 540 
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20 



TTTCATTCCT TCXTGTAAA t CTTCAATTGA TAGCACATCG GATTTAAGGA TTGGTGTTTC 660 

AAACTCGTCC CTTGGATCTC GATTAGGTGC GTTCAAGGAT TTAATAATAT CCTCTAATGT 720 

AGGTACACCG ACTTGTAATT CAATCGCCAG T 751 

(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1076 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 

TCTCCAGCTT TAACTTGATC TGGCACTTTA ACAATTGTCT GATCCATACA TACGCGACCA 60 

ATAACTTCGC ATTGATGACC ATTTACATTT ACAAAGCTAC CTTGCATTAT GCGTAAATGG 120 

CCATCTGCAT ATCCAATAgG TAACAATGCT ATTGTAGTTG GGTCAGTAGC TGTATAAGTT 180

GCACCATAAC TTACAGACTC ACCCGCTTGT AGCGTCTTTG TTTGAACTAC ATTAGCAATT 240 

AATTGCACAC TTGGTTTAAG GTGTACTTTA ACTTTTTGCT GTACATACTC TGATGGATAA 300

TATCCATAAA GGGAAATTCC TGGTCTTATT GCATTACAGA ATTGGCAATC CATTAATAGA 360 

GAGCCTGCTG AGTTCTGACA ATGTATATAT TCAGGTTTAA TTGCTTCATT GACCATATCT 420 

TTAAAACGTT GATATTGTTC AGTTGTCATA TCTCCTGGTT CGTCAGCACA GGCAAAGTGT 480

GTAAACACGC CTTCAAATAC AAGTTGCTCA TATTGTTGAA TGATTTCAAT CACTTCTTGA 540

TACGTTTTAG TATCTTTAAT ACCTAAACGT CCCATTCCTG TATCTAATTT AATGTGCAAC 600 

CATAACTTTT TCTCTTGCTC ACCAGAAATG TTTTTAATTG CTTCTTTCAA CCACTGTTTA 660 

GACGGAACCG TTAAGGCAAC TCGGTGTTGT ATCGCTTTAT CAATATCTTT AGCTGGTAAC 720 

ACACCTAAGA CTAAAATTTT AGCAGTAATC CCATGCATTC TAAGTTCTAT CGCTTCATCT 780 

AACGTTGCTA CAGCAAAAAA TGTGGCGCCA TTTTCCATTA AATGACGTGC TACTTTAACA 840

CTACCTAGTC CATAGGCATT GGCTTTAACG ACAGCCATCA CTGTTTTATT TGGATGCAAT 900

GTACTGAATA CTTTGAAATT TGATGCAACA GCGTTTAAAT CTACATTCAT ATACGCAGAT 960 

CTATAATATT TATCCGACAT ATTACTTCCT CCTGTAATTC CCACACGTTT TAAAACTAGA 1020 

TCTTAATTAT CATTGTATAA CAAATTTAAA ATGCTGACTT TTCTAAAACA ACTTGG 1076 
(2) INFORMATION FOR SEQ ID NO: 42: 
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(A) LENGTH: 293 0 base pairs 
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(C) STRAND EDNSSS : double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 

TGACCACAAT GCCCAATACA ACCATCCCAT GGTAAAGCCA AGAGATGAGT CAATAAAGCG 60

TGTTGAATAA GAGCTGAATG AACCTGATAC TGGATAAAAT GTTGCCAACT CTCCAATTGA 120 

TGACATTAAG AAATATAGCA TGACACCAAT AACAAGATAA GCGAGTATAG CGCCTCCAGG 180 

ACCAGCTTGA GAAATGATAT TACCAGTAGC TACAAATAGA CCAGTCCCAA TTGCACCACC 240

TATAGCAATC ATGGAAATGT GTCTTGAGTT AAGACTACGG TTCATTTTAT TATCTTCCAT 300 

ATTTAGTCTC CCATCTATTT AAATATACCC ATTATTGTAA GCTTTTTAAG TGTACTATTC 360 

AATAACTATT TAGTACTGTA AAGCGAAAAA ATTAAAATTT TCTGATTTTT TAATCATCTT 420 

GAGCATGTTT AATTGTAATT TTGATGGGGT TAAATTATAA TATGTATTAA ATTATAATTA 480

TnATAAATTG TGGAGGGaTG ACTATGTCAC AACAAGACAA AAAGTTAACT GGTGTTTTTG 540 

GGCATCCAGT ATCAGACCGA GAAAATAGTA TGACAGCAGG GCCTAGGGGA CCTCTTTTAA 600 

TGCAAGATAT TTACTTTTTA GAGCAAATGT CTCAATTTGA TAGAGAAGTA ATACCAGAAC 660

GTCGAATGCA TGCCAAAGGT TCTGGTGCAT TTGGGACATT TACTGTAACT AAAGATATAA 720 

CAAAATATAC GAATGCTAAA AtATTCTCTG AAATAGGTAA GCAAACCGAA ATGTTTGCCC 780 

GTTTCTCTAC TGTAGCAGGA GAACGTGGTG CTGCTGATGC GGACGTGACA TTCGAGGATT 840 

TGCGTTAAAG TTCTACACTG AAGAAGGGAA CTGGGaTTTA GTAGGGAATA ACACACCaGT 900 

ATTCTTCTTT AGAGATCCAA AGTTATTTGT TAGTTTAAAT CGTGCGGTGA AACGAGATCC 960 

TAGAACAAAT ATGAGAGATG CACAAAATAA CTGGGATTTC TGGaCGGGTt TCCAGAAGCA 1020 

TTGCACCAAG TAACGATCTT AATGTCAGAT AGAGGGATTC CTAAAGATTT ACGTCATATG 1080 

CATGGGTTCG GTTCTCACAC ATACTCTATG TATAATGATT CTGGTGAACG TGTTTGGGTT 1140

AAATTCCATT TTAGAACGCA ACAAGGTATT GAAAACTTAA CTGATGAAGA AGCTGCTGAA 1200 

ATTATAGCTA CAGATCGTGA TTCATCTCAA CGCGATTTAT TCGAAGCCAT TGAAAAAGGT 1260 

GATTATCCAA AATGGACAAT GTATATTCAA GTAATGACTG AGGAACAAGC TAAAAACCAT 1320 

AAAGATAATC CATTTGATTT AACAAAAGTA TGGTATCACG ATGAGTATCC TCTAATTGAA 1380

GTTGGAGAGT TTGAATTAAA TAGAAATCCA GATAATTACT TTATGGATGT TGAACAAGCT 1440 

GCGTTTGCAC CAACTAATAT TATTCCAGGA TTAGATTTTT CTCCAGACAA AATGCTGCAA 1500

GGGCGTTTAT TCTCATATGG CGATGCGCAA AGATATCGAT TAGGAGTTAA TCATTGGCAG 1560 
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GGTCAAATGC GCGTAGTTGA CAATAACCAA GGTGGAGGAA CACATTATTA TCCAAATAAC 1680 

CATGGTAAAT TTGATTCTCA ACCTGAATAT AAAAAGCCAC CATTCCCAAC TGATGGATAC 1740 

GGCTATGAAT ATAATCAACG TCAAGATGAT GATAATTATT TTGAACAACC AGGTAAATTG 1800

TTTAGATTAC AATCAGAGGA CGCTAAAGAA AGAATTTT7A CAAATACAGC AAATGCAATG 1860

GAAGGCGTAA CGGATGATGT TAAACGACGT CATATTCGTC ATTGTTACAA AGCTGACCCA 1920 

GAATATGGTA AAGGTGTTGC AAAAGCATTA GGTATTGATA TAAATTCTAT TGATCTTGAA 1980

ACTGAAAATG ATGAAACATA CGAAAACTTT GAAAAATAAA TTTGATATGT AGTTTCTATA 2040 

TTGCGTAGTT GAGCAGTTTA TGATATCATA ATAAATCGTA AAGATTCCTA ACAAGAGAGG 2100 

GTGTTTAACG TGCGCGTAAA CGTAACATTA GCATGCACAG AATGTGGCGA TCGTAACTAT 2160 

ATCACTACTA AAAATAAACG TAATAATCCT GAGCGTATTG AAATGAAAAA ATATTGCCCA 2220 

AGATTAAACA AATATACGTT ACATCGTGAA ACTAAGTAAT TCTTATCATT CAAATACGAC 2280

GATTTGAAAA TAAAGCGGGC TTACCTATTA TATTGGGGAG CTCGCTTTTT TATGAAATTT 2340 

TTGTGAAGAG TGATTAATGG ATTGAGTTTC ATCGGTAGAA CAATATATGA TTATATTAGT 2400

TGTTACTTTA TTAAAaTTTG AGAATATTTA TAGAAGGAAA TAGATTACTG ATTTTATAAA 2460

GTCACTTTGT TAGCGAATGC TTGAAAGAGT ATTTAATATA GTAGAATTTA AAATTTCAAA 2520 

GCGGAATTTA ATAAGTACGA AGTAGTTCTG GGTATGTTTT ATAAATGTTC GATAATACAC 2580

TTTAATCTTA AATATGATGG TTTAGAAAAT GATTTAACAA AGAAATGAaA CTTTACTGTT 2640

GAATTATGTG AGGATTGTGT TATTATATAA ATCGTAATAA TTACGATTTG ATAAAAAGTG 2700 

AGGTAACTAT ATATGGCTAA GAAATCTAAA ATAGCAAAAG AGAGAAAAAG AGAAGAGTTA 2760 

GTAAATAAAT ATTACGAATT ACGTAAAGAG TTAAAAGCAA AAGGTGATTA CGAAGCGTTA 2820

AGAAAATTAC CAAGAGATTC ATCACCTACA CGTTTAACTA GAAGATGTAA AGTAACTGGA 2880 

AGACCTAGAG GTGTATTACG TAAATTTGAA ATGTCTCGTA TTGCGTTTAG 2930 

(2) INFORMATION FOR SEQ ID NO: 43:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3606 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:

CTTCTTGCCA TGGCTCTCTT TATTTAAAAA TGCTTCCAAC TTGTCCATTT GATTGTTTCT 60 

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TTATAAAAAA CTAATTTTAC AAATGCTTTT GCGTTCTTAC AAAAAATGCA TTTGACTAT7 180

ATTATAATAA GCGTATAATT GTCGCATATT ATTTTTTGTA TTTTTCGCAA TAACGAAGGA 240

GTATTTATGA ATAAAGACAA GCAATTGCAC AACGACAAAA TCAATCTATC CCAATTAGTC 300

TTATTAGGGT TAGGCTCTTT AATAGGATCT GGTTGGCTAT TTGGTGCGTG GGAAGCATCA 360

TCAATAGCTG GACCAGCAGC AATCATATCA TGGGTTCTTG GATTCCTAGT CATTGGAACC 420

ATTGCCTATA ACTACATTGA AATCGGCACA ATGTTTCCTC AATCAGGTGG CATGAGTAAC 480

TATGCCCAGT ATACACATGG CTCATTATTA GGCTTTATTG CTGCTTGGGC GAATTGGGTG 540

TCTTTGGTGA CAATAATACC TATCGAAGCT GTGTCAGCTG TTCAATATAT GAGTTCTTGG 600

CCGTGGCATT GGGCGAAACC AATGAGATAT TTAATGGAAA ATGGCTCTAT TAGCACATAC 660

GGATTGCTAG CTGTATATCT CATCATTGTT ATTTTTTCAT TATTAAACTA TTGGTCCGTA 720

AAACTTTTAA CATCATTTAC GAGTTTAATT TCTGTATTTA AATTAGGCGT ACCCATGTTA 780

ACCATCATCA TGTTGATGCT ATCAGGATTC GACACTTCAA ATTACGGCCA TTCGGCAAGC 840

ACATTTATGC CTTACGGAAG TGCACCGATT TTTGCTGCAA CAACAGCATC AGGGATTATT 900

TTTTCATTCA ATTCATTCCA GACAATTATT AATATGGGTT CAGAAATTAA AAATCCTGAA 960

AAAAATATCG CAAGAGGCAT CGCTATCTCA CTGTCAATCA GTGCAGTGTT GTACATCATT 1020

TTACAAAGTA CGTTTATCAC TTCTATGCCT CAATCAATGT TACAACATAG TGGATGGAAT 1080

GGCATCAACT TCAATTCACC ATTTGCTGAT TTAGCTATCT TATTAGGAAT TAATTGGCTC 1140

GCAATTTTAC TATACATTGA AGCTTTTGTA TCACCATTCG GTACTGGCGT GTCATTTGTC 1200

GCCGTTACAG GTCGAGTTTT ACGAGCAATG GAGAAAAATG GACATATCCC TAAATTTCTT 1260

GGGAAGATGA ATGAAAAATA TCATATCCCA CGTGTAGCAA TCATCTTTAA TGCCATCATT 1320

AGT/STGATTA TGGTTACATT ATTTAGAGAT TGGGGTACGC TAGCAGCAGT TATTTCTACT 1380

GCAACTTTAG TAGCCTATTT AACTGGCCCA ACGACAGTGA TTGCATTAAG AAAAATGGGA 1440

CCAACAATGA CTCGTCCATT TAGAGCAAAA ATTTTAAAAG TAATGGCACC ATTATCATTT 1500

GTATTAGCTT CATTAGCTAT ATATTGGGCA ATGTGGCCAA CAACGGCTGA AGTTATTTTA 1560

ATCATTATAC TTGGATTACC AATCTACTTC TTCTATGAAT ATCGTATGAA TTGGCGTAAT 1620

ACAAAGAAAC AAATTGGTGG TAGCTTATGG ATTATTGTAT ATTTAATCGT GCTATCAATA 1680

CTGTCATTTA TAGGAAGCAA AGAATTTAAA GGCTTAAATA TGATTCACTA TCCATTTGAC 1740

TTTATCGTTA T7ATTATTGT GGCACTTATC TTCTATTACA TCGGTACAAC GAGTTCATTT 1800

GAAAGCGTCT ATTTCCGTCG CGCAACACGA ATCAATACGA AGATGCGTGA GTCACTAAAT 1860
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CACACACATT AACCAACCAT TGATTTCAAC ATCTTGGTTG GTTTTTTATT TTGAAAATCG 1980

GTTATAAATA ACTAACATAA CAAGATGATG ATCAGGCTGG GACATAAATC AATGTTCTAT 2040

GCTCTACGAA gTTATATTGG CAGTAGTTGA CTGAACGAAA ATGCGCTTGT AACAAGCTTT 2100

TTTCGATTCT AGTCAGGGGC CCCAACACAG AGAATTTCGA AAAGAAATTC TACAGGCAAT 2160

GCAAGTTGGG GTGGGACGAC GATAAAGAAA TACTTTTTCT ATAGAAATTA GTATytCTTA 2220

TGCATGAGTT TTACTCATGT ATTCATATTT TTAAGTACAC ATTAGCTGTG GCTAATGTAT 2280

AAGAACCACT ACATAATAAA TCATTTGTGG CTCTTTATCA TTTCTGTCCC ACTCCCGTAG 2340

AAGTACATCA TATAATGCTG AAAATGGTTT GAGTTAAAAC AGATATCAAG CTCGTCTGAT 2400

TCAGTCACAA AATTGTCTTG TTATACTTGT CACCTATCAT CTATAGACCG TGGTATGATT 2460

AAATTGGGGA TGATAAAGGA GGTTAATAAA TATGAAGATT AATACTACAG GTGGTCAAAT 2520

TCATGGTATT ACACAAGATG GTTTAGATAT CTTCTTAGGC ATTCCTTATG CAGAACCACC 2580

AGTTCATGAC AATCGCTTTA AACATTCTAC GTTAAAAACA CAATGGTCAG AGCCAATTGA 2640

TGCAACTGAA ATACAACCCA TCCCACCGCA ACCAGACAAC AAATTAGAAG ATTTTTTCTC 2700

CTCACAATCT ACAACTTTTA CTGAACATGA AGACTGTTTA TATCTAAATA TTTGGAAACA 2760

ACATAATGAT CAGACGAAGA AACCTGTCAT CATTTATTTT TATGGTGGTA GTTTTGAAAA 2820

TGGTCATGGT ACAGCCGAAC TCTATCAACC GGCACATTTA GTACAAAATA ACGACATTAT 2880

CGTTATTACA TGCAATTATC GTTTAGGCGC ATTAGGATAT TTAGACTGGT CATATTTTAA 2940

TAAAGATTTT CATTCCAATA ATGGCCTTTC AGATCAAATC AATGTCATAA AATGGGTGCA 3000

TCAATTTATT GAATCCTTCG GTGGCGACGC TAATAACATT ACTTTAATGG GTCAGTCTGC 3060

AGGCAGTATG AGCATTTTGA CTTTACTTAA AATACCTGAC ATTGAGCCAT ACTTCCATAA 3120

AGTQGTTCTA CTAAGTGGCG CACTACGATT AGACACCCTT GAGAGTGCAC GCAATAAAGC 3180

ACAACATTTC CAAAAAATGA TGCTCGATTA TTTAGATACA GATGATGTTA CATCATTATC 3240

GACAAATGAT ATTCTTATGC TGATGGCGAA gcTAAAACAA TCTCGAGGAC CTTCTAAAGG 3300

GCTTGATTTA ATATATGCGC CTATTAAAAC AGATTATATA CAAAATAATT ATCCAACAAC 3360

GAAACCAATT TTTGCATGTT ATACAAAAGA TGAAGGCGAT ATTTATATTA CTAGTGAACA 3420

GAAAAAATTA TCGCCGCAAC GCT7TATCGA CATTATGGAA TTAAATGATA TTCCTTTAAA 3480

ATACGAAGAT GTTCAGACGG CGAAGcAACA ATCTTTAGCG ATTACACATT GTTATTTCaA 3540

ACAGCCGATG aAGCAATTTT TACmACmACT CAATATACmA GATTCCAACC GCACCAACTA 3600

TGGCTT 3606
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15109 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

GAAATTAAAA AAGCAATTGG nACAAGATGC AACAGTGTCA TTGTTTGATG AATTTGATAA 60

AAAATTATAC ACTTACGGCG ATAACTGGGG TCGTGGTGGA GAAGTATTAT ATCAAGCATT 120

TGGTTTGAAA ATGCAACsAG AACAACAAAA GTTAACTGCA AAAGCAGGTT GGGCTGAAGT 180 

GAAACAAGAA GAAATTGAAA AATATGCTGG TGATTACATT GTGAGTACAA GTGAAGGTAA 240 

ACCTACACCA GGATACGAAT CAACAAACAT GTGGaAGAAT TTGAAAGCTA CTAAAGAAGG 300 

ACATATTGTT AAAGTTGATG CTGGTACATA CTGGTACAAC GATCCTTATA CATTAGATTT 360 

CATGCGTAAA GATTTAAAAG AmAAATTAAT TAAAGCTGCA AAATAATTCA GCTATATAAG 420 

TTAGTGAAAT GAGAGTCTGA AACATATCAA TCTTTTGATA TTGTATTAGG CTCTTATTTT 480 

TATAGCTAGA AAGTTAGATA TTTGTATTTT TTTAAATAAT AAGTGCCGTT GTTATCGTTC 540 

AATTTAATTA ATGATAGATT AGTATTATTA TAGCTAAAGT AGTATACCTG AGAAAATAGC 600 

TCAATGTATC TCTTTATTAA TAAGTTATAT CATAATTATT TTAGTGCATA CTTTATGGAA 660 

GGGATATCAG GGAATGGCTT TCAATTAAAG AAGAGGTTTA AAAGGATTAC AACAGAATGT 720 

TATGATTTTG TAGAAAGATA TATAACAACG TTTTATAAAA ACATAATATT GTTAATGGAA 780

AATGAAATGT AAGGGGGATT TCGAGTGACT AAGAAAGTTT ATTTTAACCA CGATGGTGGT 840

GTAGATGATT TAGTATCTCT ATTTTTATTA TTACAAATGG AAAACGTTCA ATTGATAGGG 900 

GTGAGTACAA TTGGTGCTGA TTGTTATTTA GAGCCATCTT TGAGCGCATC AGTAAAAATT 960

ATTAATCGTT TTTCAAATGA AGATATTCAA GTTGCGCCAT CATATGAACG AGGAAAAAAT 1020

CCATTTCCTA AAGAATGGCG TATGCATGCC TTTTTTATGG ACGCATTGCC AATTTTAAAT 1080

GAGCCAGTCA AACATGTTGC TTCAAATGTG AGCGACAAAG AAGCCTTTGA AGACATTATT 1140 

CAAACTTTAA AGAGACAATC AGAAAAAGTA ACATTATTAT TTACAGGCCC GCTTACAGAT 1200 

TTAGCAAAAG CACTACAAAA AGATTCATCT ATCGTTCAGT ATATAGAAAA ATTAGTTTGG 1260

ATGGGTGGCA CCTTTTTACC AAAAGGAAAT GTTGAAGAAC CTGAGCATGA TGGTTCTCCA 1320

GAATGGAATG CATATTGGGA TCCAGAAGCG GTTAAAATTG TTTTTGATAG CGATATAGAG 1380

ATTGATATGG TTGCTTTAGA AAGTACGAAT CAAGTACCGC TAACGTTAGA TGTTAGACAA 1440
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GTACCACCAT TAACACACTT TATAACAAAT TCTACTTACT TT TT ATGGGA TGTTTTAACG 
ACTGCTTATA TTGGTAACAA GGACTTGGTT CATTCAATTG AGAAAAAAGT CGATGTAATA
AGTTATGGAC CAAGTCAAGG TAAGACATTT GAGTGTAAAG ATGGGCGCAA AATTAATGTC 
ATAAATCATG TAGATAACAA CGCATTTTTT GATTATATAA CTGCACTTGC 7AAAAAAGTA 
AATTAACAGC TGTGTAGAAT AATTAAGGTT TTAATTTATA TAGAACAACT TATTGTAAAC 
TTTTCATTTC TTAAAGTTTA CAATGGTGCT ATAATAATGG TCATGAAATA CGAAAGGAAG 
TAAAAAATGA CAACAAAACA GTTAGTATAT ACAGCTTTAA TGACAGCGAT TATCGCTATT 
TTAGGATTGG TACCGGTAAT TCCACTACCA TTTTCTTCAG TACCAATTGT ACTTCAAAAC 
ATTGGTATTT TCTTAGCAGG TGCGATTTTA GGACGTAAAT ATGGCACATT AAGTGTTATC 
GTCTTTTTAT TATTAGTAGT TGCTGGCTTG CCATTGTTAT CAGGTGGTCG CGGTGGCATC 
GGTGTATTCG CAGGTCCTTC AGCAGGGTTT TTACTATTAT ATCCAGTTGT AGCATTCATG 
ATTGGGGCGA TTCGAGATAG ATTCATCAAT GAAATTAATT TCTGGATTTT ATTCGTTGGT 
ATTTTAGTTT TTGGTGTTAT AGCATTAGAT GTTATTGOTA CATTGATTAT GGGCATGATT 
ATTAACATAC CATTTACGAA AGCTATTTCA ATTTCATTAG CTTATTTGCC TGGTGATATA 

TTAAAAGCAA TTGTAGCAAG TTTGATTGGT ACAGCTTTAC TTAATCACTC GCAGTTTCGT 
CAAATTATGG GAATAAAATA ATCATATTTA AGATAGTAAA GTAATTGAAT AAGTTGC7TT 
GAAATTTATA AAAGTGAAAG GAGTAGGTGT CAATGGCTAG TATAAGTATG TCAGATATAT 
ATTGTAACGG CACTATATTT GAAAATGACG ACGAGCAGTT GATTTATTTA ACGCCTTCTT 
TTCCACAACG ATACACAAGT AACACATGGA TATATAAAAA GACGCCTACC CAAGAGCGAT 
GGCTGAAAGA CTTAGAACGT CAACATCAAT TACATACAAA TCAAGGTTCA AATCATTATG 
CGTTTAGTTT CCCGGAAAAT GAACAACTTG ATAATCATTG GATGGCTA7G TTTAAAGATA 
TGAATTTTGA ACTAGGTATT ATGGAATTGT ATGCCATAGA AAGTGATGCG CTTGCCAATT 
TGCCGCGTAA CTCTGACGTT GAAATTGCCA TCGTTGACGA GTCGCATATA GATGCCTATT 
TAAAAGTTGC ATATCAGTTT AGTTTGCCAT TTGGAAAAGA CTATGCAGAT GCACATGAAG 
AAATGGTAAG GGAACATTAT CAAAAAGATG TGATTAAACG CTTAGTAGCT TATTTAAATA 
ATGAACCTAT TGGCGTTGTA GATGTCATTG AAAGTGAAAA TTACATTGAA TTAGATGGAT 
TTGGTGTATT AGAACAATTT CGGCACCAAG GAATTGGATC TACAATTCAA TCGTTGATAG 
GTGAATACGC CATATCAAAA AATCACAAAC CAATCATATT AGTTGCAGAT GGTGAAGATA 
CAGCAAAAGA TATGTATGCA AAGCAAGGTT ATGTCTATCA ATCGTTTTGT TATCAAATAT 
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1620 

1680 

1740 

1800 

1860

1920 

1980 

2040 
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2160 
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2280 

2340 

2400 

2460 

2520 
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2700 
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3240 
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TAAGCTGGTT TCGAGTAGAA ATCAACTTAC TGCTTTTTAA ATTGTTTTGA GCTACTTATA 3360 

CTTATAAAAA TAGTGCGTTT AAATTGTTGA TTCATGTAGA ATATCGTTCA TTATGACACA 3420 

CTATAATGAA TATGTTATTG TTCAGAATCA ATGATACGTT CTGGATGACT GTATATATTA 3480

AAGCCACCAT TTCGAATAAA TCCAACTGCC GTAATATTTA GGTCATTAGC TAAGGTTACA 3540

GCAAGCGTTG TCGGAGCTGA TTTAGATAAA ATGACGCCAA CACCAATTTT TGCGGCTTTA 3600

ATTAAAATTT CTGATGAAAT ACGTCCACTA AAAATTAATA CTTTATCTCG GACAGTAATA 3660 

TGTCGCTGAA TACAAAATCC ATATAATTTA TCTAGAGCGT TATGTCTACC AATGTCTTGT 3720 

CGATGTACAA AAAATGTCAA ACCATCGCTT ATAGCAGCAT TATGTAAGCC ACCTGTTTCT 3780 

TGGTAAATAT GACTTGCACT TTGTAATCGA GTCATCATGT TAATAATTTG CATTGGAGTT 3840 

AAAGTGATTT TAGACATAGA TGTTTTAGCG ATAGCAGCAT CATTTTGAAA ATAAAACTCA 3900 

CGACTCTTTC CGCAACAAGA TGCAATCATT CGTTTTGTGG AATATTGAAA GCGATCGCCT 3960

AAATCTTTAT TAAGTTCAAC ATGGGCAAAA CCTTTACTAT CATCAATCAG TACAGATTTT 4020

AATTCATCTC GCTTTAAAAT GGCACCTTCC GAAGCCAGAA ATCCAATGAC TAACTCCTCA 4080

AGGTTTGTTG GACTGCATAT AACAGTCGCA AATTCTTCAC CATTCACCAT AATTGTAAGT 4140

GGAAATTCTG TCACATATTG ATCTGTTGTA TTGAATAATT TTCCATCTTC ATATCTAACA 4200 

ATTGGTTGAC CTAAAGATAC ATCTTTGTTC ATTATCTAAC CCCTTTAATT AGCTTAAACT 4260 

TTATTTTAAA GCAATTTGCT TAAAATTTTA ACATATTTGC TTAAGTTTGA AATTTGATTG 4320

ATAAAAATTA ATAGCGAGCA ATCTGTTTGA TTTAAATTGA ATTCGAGAAT ATACATACTA 4380

GGGCATCAAT TAATAAATAT CAATCTTATG CAAATTTGAC AATTGTTTGA ATCAATATAT 4440

AAACAGGCAA CGGTTCTTTT CAAATATAAT AGTAAGTGTA TAATGAAAAT GTAAATATTA 4500

TTAAAAATGG GGGTTCACTC AATGAAATTG AAACGTTTAT TTGCTGTTGT GATTGCAATG 4560

CTTTTAGTAT TAGCTGGTTG CTCTAATTCT AACGATAATA ATGAAAGTAA AAAAGATGAC 4620

GCAGACAATG GTAAGAAACA AGAGATTCAA GTTGCAGCGG CAGCAAGTTT AACAGATGTA 4680

ACCAAGAAAT TAGCTTCAGA ATTTAAAAAA GAGCATAAAA ATGCTGATAT TAAATTTAAC 4740

TATGGTGGAT CAGGGGCATT AAGAAAACAA ATTGAATCAG GCGCACCTGT TGACGTATTT 4800

ATGTCTGCAA ATACTAAAGA TGTAGATGCA TTAAAAGACA AGAATAAAGC GCATGATACA 4860

TATAAATATG CGAAAAATAG TCTAGTATTA ATTGGTGATA AAGATTCAAA TTACACTTCA 4920

GTAAAAGACT TAAAAGACAA TGATAAATTA GCATTAGGTG AAGTGAAAAC TGTACCAGCA 4980

GGAAAATATG CGAAACAGTA TTTAGATAAC AATAACTTAT TTAAAGAAGT CGAAAGTAAA 5040 
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CAAGGTTTTG t GX A7AAAAC TGACTTATAT AAACAAAATA AAAAAATTGA TACTGTAAAA 5160 

GTAATTAAAG AAGTAGAACT TAAGAAGCCA ATCACATACG AAGCTGGTGC TACATCAGAT 5220 

AGTAAATTAG CAAAAGAGTG GATGGAATTC TTAAAATCAG ATAAAGCTAA AGAAATACTA 5280 

AAAGAATACC ACTTTGCAGC ATAAGGAGTT GTAATCCATG CCTGACTTAA CACCTTTTTG 5340 

GATATCAATA CGAGTTGCTG TAATCAGTAC GATTATTGTA ACGGTTTTAG GTATTTTTAT 5400 

ATCTAAATGG TTGTATCGTC GTAAGGGTTC GTGGGTTAAA GTATTGGAAA GTTTATTGAT 5460 

ATTACCTATT GTTTTGCCGC CAACGGTATT AGGTTTTATT CTATTAATCA TCTTCTCGCC 5520 

AAGAGGACCA ATCGGTCAAT TCTTTGCGAA TGTACTACAT TTACCTGTAG TGTTCACTTT 5580 

GACAGGTGCT GTGATAGCAT CTGTCATTGT TAGTTTTCCA CTAATGTATC AACATACTGT 5640

GCAAGGCTTC AGAGGTATAG ACACGAAAAT GATTAATACA GCTAGAACGA TGGGAGCAAG 5700 

TGAAACGAAA ATTTTCCTCA AATTAATTTT ACCATTAGCT AAACGCTCTA TTTTAGCAGG 5760 

TATAATGATG AGTTTTGCTC GTGCATTAGG TGAGTTTGGT GCTACATTAA TGGTTGCAGG 5820 

ATATATTCCA AATAAAACGA ATACACTACC TTTAGAAATA TACTTCTTAG TGGAACAAGG 5880 

TAGAGAAAAT GAAGCGTGGT TATGGGTATT AGTGCTAGTC GCATTCTCTA TTGTGGTTAT 5940

ATCTACAATT AATTTATTGA ATAAAGATAA ATATAAGGAG GTCGACTAGA TGCTTAAAAT 6000 

CAATGTGAAA TATCAATTAA AGAACACTTT AATTCGCATC AATATAGATG ATACTGAACC 6060 

AAAAATTTAT GCAGTTCGTG GTCCATCTGG CATTGGTAAA ACTACTGTTT TAAATATGAT 6120

TGCCGGATTA CGTAAAGCAG ATGAAGCTAT TATCGAAG7G AATGGGCAAT TACTTACTGA 6180 

TACGGCAAAA AACGTGAATG TTAAAATTCA ACAACGACGT ATTGGATATC TGTTTCAAGA 6240 

CTACCAATTG TTTCCTAATA TGACGGTCTA TAAAAATATT ACTTTTATGG CTGAACCATC 6300 

TGAACACATC GATCAATTAA TTCAAACTTT AAACATTGAT CATTTGATGA AACAATATCC 6360 

TATGACATTG TCAGGTGGAG AGGCACAACG TGTAGCACTT GCACGTGCAC TTAGCACrAA 6420 

ACCAGATTTA ATTTTATTAG ATGAACCTTT TTCTAGTTTG GATGATACTA CAAAAGATGA 6480 

GAGTATTACA TTAGTTAAAC GTATTTTCAA CGAATGGCAA ATACCAATCA TATTTGTGAC 6540 

ACATTCAAAC TATGAAGCAG AACAAATGGC TCATGAAATT ATTACAATTG GGTAATCATT 6600 

TATTTGCCAT TAAAGAGTTT AGAACGTATT TAAAATTGTA GAAGTGAATG CTTCTATCAG 6660 

CATTTTAATG ATGTTTTAAA CTCTTTTTTA GGGGCAGTTT TTTTGAGAGA CATTGACGCG 6720 

CGTCATATAA TGAAAGTAAT GATAAAAAGA AAGGATAACT TAATGTGAGT CAAGAACGTT 6780

ATTCAAGGCA AATTTTATTT AAACAAATAG GTGAAATAGG TCAAAGCAAA ATAAATCAAA 6840
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GAGCAGGCAT TGCCAAACTA ATCATTGTTG ATAGAGATTA TATTGAATTT AGTAATTTAC 6360 

AAAGACAAAC ATTGTTTACT GAAGAAGATG CTTTGAAAAT GATGCCTAAG GTGGTTGCAG 7020 

CTAAAAAGCA TTTGCTAGCG TTACGTAGTG ATGTTGATAT TGATGATTAT ATTGCCCATG 7080

TGGATTATTA TTTTTTGGAA ACACATGGAC AGGACGTTGA CGTTATTATT GATGCAACCG 7140 

ATAACTTTGA AACACGACAA CTGATTAATG ATTTTGCATA TAAATATCGT ATACCTTGGA 7200 

TTTATGGTGG TGTTGTACAG AGTACATATA CAGAAGCTGC ATTTATACCT GGTAAAACAC 7260 

CTTGCTTTAA CTGTTTGGTA CCACAATTGC CAGCATTAAA TTTAACATGT GATACAGTAG 7320 

GGGTCATTCA ACCTGCCGTG ACGATGGCAA CAAGTTTACA ATTAAGAGAT GCGATGAAAG 7380

TATTAACGGA ACAACCAATT GACACAAAAA TAACTTATGG CGATATTTGG GAAGGTAGTC 7440 

ATTATTCATT TGGTTTCAGT AAAATGCAAC GTTCAGACTG TACAACTTGT GGAGATGTAC 7500 

CAAGTTATCC GTATTTAAAC AAGAATGAAC AACGTTATGC AACATTGTGT GGTAGAGACA 7560 

CTGTACAGTA TGAAAATGCA TCAATTACAC ACGACATTCT TGTTCAATTT TTAAAACAAC 7620 

ATCAGTTAAA TTATCGCAGT AATTCGTATA TGGTTATGTT TGAATTTAAA GGACACCGCA 7680 

TTGTTGCTTT TAAAGGTGGA AGGTTTTTAA TACATGGCAT GACACGCACA TCAGATGCCA 7740

CACATCTAAT GAATTTATTG TTTGGATAAA AAAAGATAAG ACAAAAGGAG TGTAATATTA 7800 

TGGGCGAACA TCAAAACGTT AAATTGAATC GTACAGTTAA AGCAGCCGTA CTAACGGTAT 7860 

CAGATACTAG AGACTTTGAT ACAGATAAAG GTGGTCAATG CGTGCGCCAA CTATTACAAG 7920 

CAGATGACGT TGAAGTGAGT GACGCACATT ATACAATTGT GAAAGATGAA AAAGTAGCCA 7980

TCACGACGCA GGTGAAGAAG TGGTTAGAAG AAGATATTGA TGTCATCATT ACGACTGGTG 8040 

GAACAGGTAT TGCACAACGT GATGTGACGA TTGAAGCAGT AAAACCACTT TTAACTAAAG 8100 

AGAtAGAAGG CTTTGGGGAA TTGTTTAGAT ATTTGAGTTA TGTTGAAGAT GTTGGCACGC 8160 

GTGCATTATT GTCTCGTGCT GTAGCAGGTA CAGTTAATAA TAAATTGATA TTTTCGATTC 8220 

CAGGATCAAC AGGCGCAGTT AAATTAGCAT TAGAAAAGCT CATTAAACCA GAATTAAATC 8280 

ATCTGATTCA TGAGCTTACA AAATAATTTA TTGATTTGAT TGGCGTTGAA AATCTCCAGA 8340 

TTTACCGCCA GACTTGCTTT CAAGGTAGGT TTCGCCAATA ATCATACCTT TATCAACTGC 8400

TTTCGTCATG TCGTAAATGG TTAAAGCCGT TGCTGATGCA GCGGTTAAAG CTTCCATTTC 8460 

AACACCGGTT TTGCCAGTTG TAGAGACAGT TGTTTGAATG TTTAAAGTAT AAAGGGGTGC 8520 

ATTTGTTTCA TCCCAGCTGA AGTGAACATC TATGCCAGTC AATGGTAATG GATGGCACAT 8580

CGGAATAAGT GTTGATGTAT TTTTGGCAGC CATAATACCA GCGATTTGAG CAGTGTTCAA 8640
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AATGCTTGAA TGAGCGACAG CAGTTCTTTT TGTAATTTGT TTGTCTGATA CATCGACCAT 8760 

TTTGGCGTGG CCTTGTTGAT TAATATGAGT AAACTCAGTC ATTTTACCCC TCCTAGTGCA 8820 

TCTAGTATAT CATGAAAAAA TAAAAGTTTT GGAGATGATT TTTAATGGTA GTAGAAAAAA 8880 

GAAACCCAAT CCCAGTTAAA GAAGCAATTC AACGTATCGT TAATCAGCAG AGTTCAATGC 8940 

CGGCAATTAC GGTAGCACTT GAAAAAAGTC TAAATCATAT CTTAGCAGAA GATATTGTAG 9000 

CTACTTATGA TATACCAAGG TTTGATAAAT CACCTTATGA TGGTTTTGCA ATTCGCAGTG 9060 

TTGATTCACA AGGGGCAAGT GGTCAGAATC GCATTGAGTT TAAAGTGATT GATCATATTG 9120 

GTGCAGGTTC AGTTTCTGAT AAATTAGTTG GGGATCACGA AGCGGTGCGT ATTATGACTG 9180 

GAGCACAAAT ACCTAATGGC GCAGATGCTG TTGTTATGTT TGAACAAACG ATTGAACTAG 9240

AAGATACATT TACAATTCGT AAACCATTTT CAAAAAATGA AAATATATCT TTAAAAGGTG 9300 

AAGAAACAAA GACAGGCGAT GTTGTTCTAA AAAAAGGACA AGTAATTAAT CCAGGGGCTA 9360 

TCGCGGTCCT TGCAACATAT GGCTATGCAG AGGTTAAAGT TATTAAGCAA CCGAGTGTCG 9420 

CTGTTATTGC AACAGGAAGC GAATTATTAG ATGTTAATGA TGTATTAGAA GATGGGAAAA 9480 

TTCGTAACTC TAATGGCCCA ATGATTCGTG CCTTAGCAGA AAAATTAGGT CTTGAAGTTG 9540

GTATTTACAA AACACAAAAA GATGATTTAG ATAGTGGCAT CCAAGTCGTT AAAGAAGCTA 9600 

TGGAAAAACA TGATATCGTT ATTACAACGG GCGGAGTTTC TGTTGGAGAT TTTGACTATT 9660 

TACCTGAGAT TTATAAGGCT GTAAAGGCGG AAGTGTTATT TAATAAAGTA GCAATGCGTC 9720 

CTGGTAGCGT AACAACGGTT GCATTTGTAG ATGGaAAGTA TTTGTTTGGa TTATCTGGAA 9780 

ATCCATCAGC TTGTTTTACA GGATTTGAAC TATTTGTGAA nCCAGCTGTT AAACATATGT 9840 

GTGGCGCACT AGAAGTCTTC CCGCAAATAA TTAAAGCAAC ATTAATGGAA GATTTTACCA 9900 

AGGGAAACCC ATTCACACGA TTTATACGTG CTAAAGCAAC GTTAACAAGT GCTGGAGCTA 9960 

CTGTAGTACC TTCAGGATTC AATAAATCAG GTGCGGTTGT AGCGATTGCA CATGCTAACT 10020 

GTATGGTCAT GTTACCAGGA GGGTCACGTG GTTTTAAAGC GGGGCATACA GTAGATATTA 10080 

TATTGACTGA ATCTGACGCT GCTGAAGAGG AACTTCTTTT ATGATTTTAC AAATTGTAGG 10140 

TTACAAAAAG TCTGGTAAGA CAACATTGAT GAGGCATATT GTCTCTTTCT TAAAGTCACA 10200 

TGGTTATACA GTTGCTACTA TTAAACATCA TGGGCATGGT AAGGAAGATA TTCAATTACA 10260 

GGATTCAGAC GTCGATCACA TGAAGCATTT TGAAGCGGGG GCAGATCAAA GTATTGTACA 10320 

AGGTTTTCAA TATCAGCAAA CTGTAACACG TGTAGATAAT CAAAATCTTA CTCAAAT7AT 10380 

TGAAAAATCT GTTACAATTG ACACCAATAT CGTATTAGTT GAAGGCTTTA AAAATGCTGA 10440 
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GAATGTTTGT TATAGCATTA ATGTAAGGGA GCATGAAGAT TTTACAGCAT TTGAGCAATG 10560

GTTATTAAAT AAAATTAAAA ATGATTGTGA TACACAATTA ACATAGAGGA TTGAAATGAA 10620

TGAAACAATT TGAAATCGTG ACAGAACCGA TACAAACAGA ACAATATCGT GAATTCACTA 10680

TAAATGAATA TCAAGGTGCA GTAGTTGTTT TTACCGGTCA TGTTCGCGAA TGGACTAAAG 10740

GCGTCAAAAC GGAATATTTA GAATATGAAG CGTATATTCC AATGGCTGAA AAGAAATTGG 10800

CACAAATTGG AGATGAAATA AATGAAAAAT GGCCTGGAAC GATAACGAGT ATTGTTCATA 10860

GAATAGGGCC ATTACAAATT TCAGATATCG CTGTATTAAT TGCGGTTTCT TCACCGCATC 10920

GTAAAGATGC CTATCGAGCA AATGAATATG CAATTGAGCG TATAAAAGAA ATTGTTCCGA 10980

TTTGGAAAAA AGAAATTTGG GAAGATGGTT CAAAATGGCA AGGGCATCAA AAAGGGAATT 11040

ATGAAGAAGC AAAGAGGGAG GAATAAGAGA GATGAAGGTA CTTTACTTCG CAGAAATTAA 11100

AGATATATTA CAAAAAGCAC AGGAAGATAT TGTGCTTGAA CAAGCATTGA CTGTACAACA 11160

ATTTGAAGAT TTATTGTTTG AACGTTATCC GCAAATCAAT AATAAAAAGT TTCAAGTTGC 11220

TGTAAATGAG GAATTTGTAC AAAAATCGGA TTTCATTCAA CCTAATGATA CTGTTGCATT 11280

AATTCCACCG GTTAGTGGAG GTTAAGGGAG CATGAAAGCA ATAATTCTTG CAGGTGGTCA 11340

TTCAGTGCGA TTTGGTAAGC CCAAAGCTTT TGCGGAAGTG AACGGTGAGA CCTTTTATAG 11400

TAGAGTAATT AAGACATTAG AATCAACAAA TATGTTCAAT GAAATTATTA TTAGTACAAA 11460

TGCGCAATTG GCAACGCAAT TTAAATATCC AAATGTTGTT ATAGATGATG AGAATCATAA 11520

TGATAAAGGT CCATTAGCAG GAATTTATAC AATCATGAAG CAACATCCTG AAGAAGAATT 11580

GTTTTTTGTC GTTTCTGTTG ATACACCAAT GATTACTGGT AAAGCTGTAA GCACGTTGTA 11640

TCAGTTTTTA GTTTCTCATC TTATTGAAAA TCATTTAGAT GTCGCAGCTT TTAAAGAAGA 11700

TGGAtGTTTT ATTCCAACAA TTGCATTTTA TAGTCCGAAT GCATTAGGCG CTATAACTAA 11760

AGCACTACAT TCTGATAATT ACAGTTTTAA AAATGTATAT CATGAATTAT CAACGGATTA 11820

TTTGGATGTA AGGGATGTAG ATGCGCCCTC ATATTGGTAC AAAAATATAA ATTATCAGCA 11880

TGATTTGGAC GCTTTAATTC AAAAATTGTA AGCTGTTAGG AGGTCCACAA ATGGTAGAAC 11940

AAATAAAAGA TAAACTAGGA CGTCCCATCC GTGACTTACG GTTATCTGTG ACAGATCGGT 12000

GTAACTTTAG GTGTGATTAT TGCATGCCTA AAGAGGTATT TGGAGATGAT TTCGTATTTT 12060

TACCTAAAAA TGAACTTTTA ACGTTTGATG AAATGGCTAG AATCGCTAAG GTATATGCAG 12120

AATTAGGTGT AAAAAAAATA CGCATTACAG GTGGAGAACC ATTGATGCGA CGGGATTTAG 12180

ATGTACTTA7 AGCTAAATTA AATCAAATCG ATGGTATTGA AGATATTGG7 TTGACTACAA 12240
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ATGTCAGTTT GGATGCTATT GATGATACGC TATTTCAATC AATCAATAAT CGTAATATTA 12360

AAGCGACTAC GATTTTAGAA CAAATTGATT ACGCGACGTC TATTGGTTTG AATGTAAAAG 12420

TAAATGTTGT TATACAAAAA GGTATTAACG ATGATCAAAT CATACCAATG CTTGAATATT 12480

TTAAAGATAA ACATATAGAG ATTCGATTTA TAGAATTTAT GGATGTTGGT AATGATAATG 12540

GATGGGATTT CAGTAAAGTT GTAACTAAAG ATGAAATGCT TACAATGATA GAGCAGCACT 12600

TTGAAATCGA TCCTGTAGAA CCAAAATATT TTGGGGAAGT AGCAAAATAT TATCGCCATA 12660

AGGATAATGG TGTTCAATTT GGTTTGATTA CAAGTGTTTC ACAATCATTT TGTTCTACAT 12720

GTACACGCGC AAGGCTGTCA TCAGATGGGA AGTTTTACGG ATGTTTATTT GCAACTGTCG 12780

ATGCATTTAA CGTTAAAGCG TTTATTCGTT CTGGCGTGAC CGACGAAGAA TTAAAAGAAC 12840

AATTTAAAGC TTTATGGCAA ATAAGAGATG ATCGATATTC AGATGAGAGA ACTGCTCAAA 12900

CAGTTGCCAA TCGTCAACGT AAAAAGATAA ACATGAATTA TATTGGTGGT TAATGTGTAG 12960

GGACCACTAC ATATTAAATC ATTAGAGATG TTTTAATATT TCTGTCTTAC TCCCTAAAAT 13020

ACAATATTAT TTATTAAAGT AAAAACGGTC ATATCTATGC CAGATTTAAT AGAAATGATC 13080

GTTTTTAAAG TTTTTACAAG TTGGCGGGGC CCCAACACAG AAGCTGACAG AAAGTCAGCT 13140

TACAATAATG TGCAAGTTGG CGGGGCCCCA ACATAGAGAA TTTCAAAAAG AAATTCTACA 13200

GACAATGCAA GTTGGGGAAC GGGGCCCCAA CACAGAAGGT GACGAAAAGT CAGCATACAA 13260

TAATGTGCAA GTTGGCGGGG CCCCAACATA GAGAATTTCA AAAGAAATTC TACAGACAAT 13320

GCAAGTTGGG GATCAACGAA ATAAATTTTA TGAGAATATC ATTTCTATCC CACTCTTAAG 13380

AATCACTACA TAATAAATCT TTAGTGGTTC TTTAACATTG ATGTCACACT CCATGCCATT 13440

GAGTTGTAAT ATATCTTTTT TAGGTATAAA TGTTGTCGAA TAAACAACAA GTTGTCCAAA 13500

AGATATAAAT CTAAACAAGA TATAGCCAGC AATTTAATAT TTGTAATAGA TAAAATGCTA 13560

AGTTTGATAT ATAATAAATT TAAGTAATTG TATAATAATA TGAATTACAA ACATCTAAGA 13620

AGAAACATAG GAGGCATCAT ATTATGAGTA ATAAAGTTCA ACGTTTTATA GAAGCAGAAA 13680

GGGAGTTAAG TCAGTTAAAG CACTGGTTAA AAACAACACA TAAGATTTCA ATTGAAGAAT 13740

TTGTAGTCCT TTTTAAAGTG TATGAAGCTG AAAAGATTAG CGGTAAAGAA TTGAGGGATm 13800

CATTACATTT TGAAATGCTA TGGGATACAA GTAAAATCGA TGTGATTATC CGTAAAaTCT 13860

ATAAAAAAGA GCTTATTTCT AAATTGCGTT CTGAAACGGA TGAAAGACAA GTATTCTATT 13920

TCTATAGTAC TTCTCAAAAG AAATTGTTAG ATAAAATTAC TAAAGAAATA GAAGTGTTAA 13980

GCGTTACAAA CTAAAAACTT aAAAAgcaTG CCAATCTCTA TTCATCATAA TTGCGTCTTG 14040



GTTCATGGCA TTTCTAGTTA CATGACGTCC ATGAATTAAG AAGTAAACAA GCATAGTAAT 14160 

GATTGCTAAA GCGGCCATAA AGCCGAAGAT TTCACTATAT GAAAACATAT GAGTAAATAA 14220 

CCCAAGGAAT GATGGACCGA AGCCGACACC TGCATCTAGA CCAACGTAAA AAGTAGATGT 14280 

CGCGATACCA TATTTAATCG GGGGTGAGAC TTTTATCGCA ATAGATTGCA TTGCAGATGA 14340 

TAAATTTCCA TACCCTAAAC CTAGGCAAGC ACCAGCAAGT AATATTAACC AGCTTTGATA 14400

GCTTGAAATT AAGCATACAA ATGAAAGGAA AAGCATGATA AATGCTGGGT AGACAATAAT 14460

ATTTTCATTT TTATCATCCA TCAATCTACC AGCAATAGGT CTAGTAATTA ACGATGCTAT 14520

AGCATAGCAA ATAAAGAAAT AGCTTGCTGC AGTGACTAGG TGTCGCTCTA AAGCAAATGC 14580 

TTGTAAATAA GTTAGGATGG ACGCATAGGT AACGCCAATT AAAAGCATAA TTACAGCAAC 14640 

AGGAATGGCC TCTTTTGCAA TAAATTGATG AATACTAAAT CTTGGTTTAT CAATGACATT 14700 

AGTTTCAGTT TTGTTATTTG TTACTTCGAA ATCAACTTTT ATAAATAATG AGATAATGAG 14760 

TCCGAGTATG CCTAATATGA CACAAATAAT AAACAGTAAG TCAATTGCG7 ATTTTGTAAT 14820

AAGTAACATG CCTAGAAATG GGCCAATCGC TGTACCTAAT ACTAAACTTA AGGAAAATAA 14880 

ACTGATGCCT TCACTTTTTC TATTAACAGG GGTAACGTAT GCCGCAATAG TACCTGTTGC 14940 

AGTTGTCACA ACTGCAGTTG CGATACCGTT TATGAGACGT ACAAAGATTA AAAAAGCTAA 15000 

AGATCCATCA ATAAAATAAA GTAATTGCGT GATAATTAAA GCAATTAAAC CAATAAATAA 15060 

TAATCGTTTA GGTCCrATTT sATTTACAAA TTTACCTGTA GCAAATCGA 15109 

(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9072 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 
GAGAGTCAAT GGCAAGAAGA ATATAAATAT TTGAGAGCGT TAATCTTTAA TGAAACAGAA 
TTAGAGGAAG CGTATAAATG GATGCATCCT TGTTACACGT TGAATAATAA AAATGTAGTA 
CTTATCCATG GCTTCAAAAA TTATGTTGCA CTATTATTTC ATAAAGGTGC CATTTTGGAG 
GATAAATATC ATACACTCAT TCAACAGACT GAAAAGGTGC AAGCAGCTCG TCAGTTACGA 
TTTGAAAATT TAACAGAGAT TCAAGCACGT ACCGAAGAAA TTAAATATTA TCTAGCCGAA 
GCAATTAAAG CTGAAAAAGC TGGTAAAAAA GTTGAAATGA AGAAAACAGA GGAATATGTT 



60 
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AAATTAACGC CAGGCAGACA ACATCAATAT ATATATCATA TTGGACAAGC TAAACGCAgT 4 90 

GgAACAAGAC AAAAGCGTGT TGAAAAGTAT ATTAACCAAA TACTAGAAGG TAAAGGGATG 540

CATGATAAGT AATTAATGAG TAAAGCATAC CGGTTATACA ACAACATACA AGATGACACG 600 

AAACAACCAA TGGCTCATGC TGTTGGTTGT TTTTTTAGGT GTGTCTGTCA TGGGCAACAC 660 

TTTGACGTTG GAATTCCGTT ACAGGCTTGG GAGTAGAAAA TGTTAGCAAA AGGCAAGGGT 720 

GTCTACAATG AATGATGAAG ATATTAAAAT ATAAGGATGA CTTTGTGAGT GGCGGATGGG 780

CGGTTGTCCG TCTGTAACAA TGGATGCGTG TGCATTATTA CAAAAATTCG ACTTTTGTAA 840

TAATATTTCA CATTTTCGAC ACTTTTTTGC TATAAAACAA CCAATTGAGC GATAATAAAT 900 

TCGCTTTTAA AAAATATGAG TTATCTATTT AGTTGCCAAA GATAAAATAA TAATGTTTAA 960 

TAACATCATA TAGAGTATGT TAGTTTTAAA TGTCGAATAT ACGAATGTGc AAACAAAGTA 1020 

ATCGGTAGAA ATTCAACATA CATAGCGCCG TTTACTGTTA AGTATTCACA TTACAGATGA 1080 

AAAATATAAA ATTCTACATA ATCAAGACCA TGATGTGTAC TTGTTTAACT TATGACTCTA 1140 

TTTGTTTAAC AATTGCGATA ATGGTCTTTT TATTTTATGC GTATCATTCG TCATATTTTT 1200

TATGAGGAAG GAGAAATGAT TATGTTAAGT ATTAAGCATT TAACGAAAAT TTATTCTGGT 1260 

AATAAAAAGG CAGTAGATGA CATCTCTTTA GATATTCAAT CTGGGGAATT TATCGCATTT 1320 

ATTGGAACCA GTGGAAGTGG CAAAACGACT GCTTTAAGAA TGATAAACCG TATGATTGAA 1380 

GCGACAGAAG GACAAATTGA AATTGATGGT AAAGATGTTC GGAGTATGAA TCCTGTCGAA 1440

TTGCGTAGAA ATATTGGCTA TGTTATTCAA CAAATTGGCT TAATGCCTCA TATGACGATT 1500 

AAAGAGAATA TTGTGTTGGT ACCCAAATTG TTGAAATGGA CTAAAGAGGA AAAGGATAAA 1560 

CGTGCAAAGG AATTAATTAA ACTTGTGGAT TTACCGGAGT CATTTTTAGA GCGTTATCCA 1620

GCAGAACTAT CAGGTGGGCA ACAACAACGT ATCGGTGTTG TAAGAGCACT TGCGGCCGAA 1680

CAAGATATTA TTTTAATGGA TGAACCTTTT GGTGCATTGG ATCCTATTAC GAGAGATACG 1740 

TTACAAGATT TAGTTAAAAC GTTACAACGA AAATTAGGCA AGACGTTTAT CTTTGTAACA 1800 

CATGATATGG ATGAAGCGAT TAAATTAGCA GACAAAATTT GTATTATGTC AGAAGGTAAG 1860

GTGGTGCAAT TTGATACGCC AGACAATATT TTAAGACATC CCGCAAATGA TTTTGTACGT 1920 

GATTTTATAG GACAAAATAG ACTGATTCAA GACCGTCCCA ATGACAAGAC TGTAGAAGGT 1980 

GTAATGATTA AACCAATCAC GATACAAGCA GAAGCAACAC TGAATGACGC CGTTCATATT 2040 

ATGAGACAAA AACGTG7TGA TACTATTTTT GTAGTAGATA GTAATAACCA TTTACTAGGT 2100 

TTCTTAGACA TTGAAGATAT AAATCAGGGT ATACGTGGAC ACAAAAGTTT ACGAGACACC 2160 
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ATTTTAAAAA GAAACGTTAG GAATGTACCT GTCGTAGATG ATCAACAGCG TTTAGTAGGA 2280 

CTGATTACGC GTGCCAATGT TGTTGATATT GTATATGACA CGATTTGGGG CGATAGTGAG 2340 

GATACAGTGC AAACAGAACA TGTGGGGGAA GACAcTGCGT CCTCAAAAGT GCATGAGCAA 2400

CACACTACTA ATGTCAAAGT ACGTGACATA GGAGATGATA AATCATGATT GAGTTCCTAC 2460 

ATGAACATGG TGGACAGTTG ATGTCGAAAA CACTGGAACA TTTCTATATT TCTATAGTGG 2520 

CATTATTACT TGCCATCATT GTTGCAGTAC CTATAGGCAT TTTATTATCA AAAACAAAGC 2580 

GAACTGCCAA TATTGTATTA ACTGTGGCAG GTGTCTTACA AACTATTCCA ACACTAGCTG 2640 

TACTTGCTAT TATGATACCG ATTTTTGGTG TTGGTAAAAC GCCTGCAATT GTAGCGCTAT 2700 

TTATTTATGT ATTATTACCT ATTTTAAATA ACACGGTACT CGGTGTTCAA AATATTGATA 2760 

GCAACATTAA AGAAGCTGGA AAAAGTATGG GAATGACACA ATTTCAATTG ATGAAGGATG 2820 

TTGAATTGCC GTTAGCATTG CCGCTTATCA TTGGTGGCAT TCGTTTGTCA TCTGTGTATG 2880 

TAATTAGTTG GGCTACACTT GCAAGTTATG TAGGTGCGGG TGGATTAGGT GATTTCATTT 2940 

TCAATGGTTT AAATTTATAT GATCCACTGA TGATTGTAAC TGCAACGGTA CTCGTTACTG 3000 

CACTAGCATT AGGTGTTGAT GCCTTATTAG CTTTAGTTGA AAAATGGGTA GTTCCCAAAG 3060

GCTTAAAAGT ATCTGGATAA TTAGGAGGCT AAGATAATGA AGAAAATTAA ATATATACTT 3120 

GTCGTGTTTG TCTTATCGCT TACCGTATTA TCTGGATGTA GTTTGCCCGG ACTAGGTAGT 3180 

AAGAGCACGA AAAATGATGT CAAAATTACA GCATTATCAA CAAGCGAATC GCAAATTATT 3240

TCACATATGT TACGGTTGTT AATAGAGCAT GATACACACG GTAAGATAAA GCCAACATTA 3300 

GTAAATAATT TAGGGTCAAG TACGATTCAA CATAATGCCT TAATTAATGG GGATGCTAAT 3360 

ATATCAGGTG TTAGATATAA TGGCACAGAT TTAACGGGAG CTTTGAAGGA AGCACCAATT 3420

AAAAATCCTA AGAAAGCAAT GATAGCAACA CAACAAGGAT TTAAAAAGAA ATTTGATCAA 3480

ACGTTTTTTG ATTCGTATGG TTTTGCGAAT ACGTATGCAT TCATGGTAAC GAAGGAAACC 3540 

GCTAAAAAAT ATCATTTAGA GACAGTTTCA GATTTAGCAA AGCATAGTAA AGATTTACGT 3600 

TTAGGTATGG ATAGTTCATG GATGAATCGT AAAGGCGATG GCTATGAAGG ATTTAAAAAA 3660 

GAGTATGGTT TTGACTTTGG TACAGTGAGA CCAATGCAAA TAGGTCTAGT CTACGACGCA 3720 

TTAAACTCAG AGAAGTTAGA CGTTGCATTA GGTTATTCTA CAGATGGTCG AATTGCGGCG 3780

TATGATTTGA AAGTACTTAA AGATGATAAA CAATTTTTCC CACCTTATGC TGCGAGTGCT 3840 

GTTGCAACAA ATGAATTATT ACGGCAACAC CCAGAACTTA AAACGACGAT TAATAAGTTG 3900

ACAGGAAAGA TTTCGACTTC AGAGATGCAA CGCTTGAATT ATGAAGCGGA TGGTAAAGGT 3960
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AAAGGTGGTC ATAAGTAATG GAAGGTAATT TATTACAGCA ATTATTCAAT TATTATGTTA 4080

CGAACTTTGG TTATCTATGG GATTTATTTT TCAAACACTT ATTAATGTCT GTCTATGGTG 4140

TGCTGTTTGC AgCTTTAATT GGTATTCCAT TGGGAATCTT GCTTGCaAGA TACACAAAAC 4200

TTTCTGGATT TGTAATTACA ATTGCAAATA TAATTCAAAC AGTTCCAGTC ATTGCAATGT 4260

TAGCTATTTT AATGTTAGTC ATGGGCTTAG GTTCAGAAAC AGTAGTTTTA ACAGTGTTTT 4320

TATATGCGTT ACTTCCAATT ATAAAAAACA CTTATACTGG TATAGCTAGT GTTGATGCGA 4380

ATATTAAGGA TGCTGGCAAA GGTATGGGAA TGACACGCAA TCAAGTGCTA CGAATGATTG 4440

AATTACCGTT ATCTGTTTCG GTTATTATCG GTGGCATTCG TATTGCCTTG GTTGTTGCGA 4500

TAGGTGTTGT TGCCGTTGGA TCATTTATAG GAGCACCTAC GCTTGGTGAC ATTGTGATTC 4560

GTGGTACAAA TGCGACGGAT GGCACAACGT TTATTTTAGC AGGTGCGATT CCGATTGCTA 4620

TCATTGCAAT CGTCATTGAT GTACTATTAA GATTTTTAGA AAAACGATTA GACCCAACAA 4680

CACGACATCG TAAAAATCAA TCTAATCATC GGCCGCAAAG TATTAATATG TAATAGTAGA 4740

AGATGTTTAT AATTTAGCGA TTTCGTTTCA TGATTTATAA AAAATGAGGC TACTCAAGGA 4800

GCTCAAATAA TCTTTGAGTA GCCTTTTTAT AGGTTGTGTT TGTATGCGTT TACACTAAAA 4860

TAGCAATTAT TATCATGAAA GTTTTTGGAT AAAAAGCGTT AATTATTGTA AAAATACTAA 4920

AAAATGAGAT GTTTTATTTA TAATTTTCTG CAAATTTATG ATATTGTTTC TTAATATATC 4980

ATATTAAAAA TTTGTTTTTC TTAAACATAG GAGGCTTATC TAATTCATGG ACACATCAAA 5040

ACAATTTAGA GGTGACAACC GATTGCTTTT GGGTATCGTT TTAGGGGTTA TTACCTTTTG 5100

GCTATTCGCG CAGTCACTTG TTAATCTTGT TGTCCCATTA CAATCAACAT ATAGTAGTGA 5160

CGTTGGAACG ATAAATATCG CTGTTAGCTT ATCTGCCTTA TTTGCTGGTT TGTTTATCGT 5220

AGGTGCTGGT GATGTTGCTG ATAAATTTGG TCGCGTCAAA ATTACTTATG TAGGATTGAT 5280

ATTAAATGTT GTAGGTTCAT TACTCATCAT CATTACACCT TTGCCAGCAT TTTTAATTAT 5340

AGGTAGAATA ATTCAAGGTT TGTCTGCAGC ATGTATTATG CCATCAACAC TTGCTATTAT 5400

TAACGAATAT TATATTGGTA CAAGAAGACA ACGTGCCTTA AGCTATTGGT CTATTGGTTC 5460

TTGGGGTGGT AGTGGTATTT GTACGTTGTT TGGTGGCTTA ATGGCTACAT ATATAGGTTG 5520

GCGTTCAATA TTTGTTGTTT CAATTCTATT AACATTATTA GCAATGTACT TAATCAAACA 5580

TGCACCTGAG ACTAAAGCAG AACCAATCAA AGGTATGAAA GCAGAAGCTA AAAAGTTTGA 5640

CGTTATTGGT TTAGTCATTT TAGTAGTGAC GATGTTAAGT TTAAATGTAA TCATCACACA 5700

GACGTCTCAT TTTGGTTTAG TTTCACCGTT AATTCTAGGT TTAATTGTTG TGTTTATCTG 5760
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AATTTTTAAA AATAGAGGAT ACAGTGGTGC AACTATTTCA AACTTCTTAT TAAATGGTGT 5880 

AGCAGGTGGT GCACTTATCG TTATTAACAC GTATTATCAA CAACAATTAG GATTTAATTC 5940 

TTCGCAAACG GGTTATATTT CATTAACGTA TTTAATAACA GTGTTGTCAA TGATTCGTGT 6000 

AGGTGAAAAG ATTTTATCTC AACATGGTCC GAAGCGCCCA CTATTACTAG GAAGTGGCTT 6060 

TACAGTGATT GGGTTAATCT TATTGTCGTT AACATTTTTA CCAGAAGTGT GGTATATCAT 6120 

ATCTAGTATA GTTGGATATT TATTGTTTGG TACTGGTTTA GGATTATATG CTACACCATC 6180 

AACTGATACA GCAGTTGCTA GTGCGCCAGA TGATAAGTCG GGTGTTGCTT CAGGTGTGTA 6240

TAAAATGGCG TCATCATTAG GAAATGCATT TGGAGTAGCA GTATCTGGTA CGGTTTATAC 6300 

TGTGTTAGCA GCTAATTTAA ATTTGAACTT AGGTGGTTTC ACAGGTATGA TGTTTAATGC 6360 

CTTGCTAGCA ATTGTTGCAT TTTTAGTCAT TTTACTATTA GTTCCTAAAA ATCAAACGAA 6420 

TTTGTAAAAC TGAAATGAAA GCAAGTTATT ATGTAGGGAT TTTAAAGGAA ATTTTGTGAA 6480

AGTAAGTTTA TCATACACAC TTAATGTTGC GTATTGACGT TTAATGTTAG GTGTGTTCTT 6540

TTATAGACGA TAAAAGCTGT GTGCATATTA AGCGAATGAT TTTCAAATTG ACGCTAATAT 6600 

GCGAAAGTAG TATTTTTAAA ATGAACAACA ACGATGAAGA GGGGTTTATA GGATGAAAAT 6660 

TGCAATTGCT GGATCGGGTG CATTAGGTAG TGGCTTTGGT GCCAAACTAT TTCAAGCAGG 6720 

ATATGATGTC ACACT7ATTG ACGGATATAC ATCTCATGTT GAAGCGGTTA AGCAACATGG 6780 

ATTAAATATA ACGATTAATG GAGAGGCATT CGAGTTAAAC ATTCCGATGT ATCATTTTAA 6840

TGATCAACCG GACGAAAGCA TTTACGATGT TGTCTTTCTA TTTCCAAAGT CTATGCAATT 6900 

AAAAGAAGTG ATGGAAGATA TGAAGCCACA TATTGATAAT GAAACGATCG TCGTATGTAC 6960 

GATGAATGGT CTGAAGCATG AAGAAGTCAT TGCGCAGTAT GTTGCTCAAT CACAAATTGT 7020

CAGAGGTGTT ACGACTTGGA CGGCAGGTCT TGAAAGCCCT GGACACAGTC ATTTACTTGG 7080 

TAGTGGACCA GTTGAAATAG GTGAACTAGT GGATGAAGGT AAAGAAAATG TTATAAAAGT 7140 

TGCTGATTTA CTTAACGAAG CGGAATTGAA TGGTGTCATT AGTAAAGATT TATACCAATC 7200

GATTTGGAAA AAGATTTGTG TTAATGGTAC GGCAAATGCA TTAAGCACAG TGTTGGAGTG 7260 

TAATATGGCA TCGCTGAATG AAAGTAGTTA TGCGAAGTGT TTGATTTATA AATTAACGCA 7320 

AGAAATAGTG CATGTAGCGA CGATTGATAA TGTTCATTTA AATGTTGATG AAGTATTTGA 7380 

ATATTTAGTT GATTTAAATG AAaAAGTTGG TGCGCATTAT CCATCCATGT ATCAAGATTT 7440 

AATTGTTAAT AATAGAAAAA CTGAAATTGA TTATATTAAT GGCGCAGTTG CAACATTAGG 7500

TAAACAACGT CaTATTGAAG CGCCAGTCAA TCGCTTTATT ACTGATTTAA TTCATACTAA 7560 
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CAATCACGTG ATATTACGGT CATTATTAAG ATTGAAATGT AATAAATAAA GAACAGCAGT 7680

AAGGTACTTT CAAATTGAAA TGATCTTGGT GCTGTTTTTC TTGATTGATC TTCGTCATAA 7740

TTCAGATTTG TCATAGGCTA CGACATACTA TTAGTATTTA CTAGACAGTT TTTACGACGA 7800

CACTTTGAAA AATTTTGAGG CAAATCATTT GGAAGTCTCA CGTGAATTTT GTAAACTCAT 7860

CAAGCAAGTA ATTATATTAA AAAGACAAAT AGAGAAAAGG TGTTTATAAT GAGTAAAATT 7920

TTTGTAACTG GTGCAACGGG CCTTATTGGC ATTAAATTAG TTCAAAGACT AAAAGAAGAG 7980

GGGCATGAGG TTGCTGGTTT TACTACATCT GAGAATGGTC AACAAAAGCT AGCTGCTGTT 8040

AATGTAAAAG CATATATTGG TGATATATTA AAAGCTGATA CTATTGATCA AGCGTTAGCA 8100

GATTTTAAAC CAGAAATCAT TATCAATCAA ATTACGGATT TAAAAAATGT TGATATGGCA 8160

GCAAATACGA AAGTACGTAT TGAAGGTTCT AAAAACCTAA TTGATGCGGC GAAAAAGCAT 8220

GACGTTAAGA AAGTAATTGC CCAAAGTATT GCCTTTATGT ATGAACCTGG CGAAGGATTA 8280

GCAAATGAGG AAACTTCACT TGATTTTAAC TCAACTGGCG ATAGAAAAGT AACGGTTGAT 8340

GGTGTGGT7G GTTTAGAAGA AGAAACGGCT CGTATGGATG AATACGTTGT TTTACGTTTT 8400

GGCTGGTTAT ATGGCCCAGG TACTTGGTAC GGAAAAGATG GCATGATTTA TAATCAATTT 8460

ATGGATGGTC AAGTGACACT TTCAGATGGC GTAACATCAT TTGTGCATCT TGATGATGCA 8520

GTTGAAACAT CTATTCAAGC TATTCATTTT GAAAATGGTA TCTATAATGT AGCAGATGAT 8580

GCACCTGTTA AAGGTTCTGA ATTTGCAGAA TGGTATAAAG AACAACTTGG TGTTGAACCA 8640

AATATTGATA TTCAACCTGC GCAACCATTT GAACGTGGCG TAAGCAATGA GAAGTTTAAA 8700

GCGCAAGGTG GTACTCTGAT TTATCAAACT TGGAAAGATG GCATGAATCC AATTAAATAA 8760

TAATTTATCC GTTTAATATA CAAAGAATAA AGACTTGGTC GAATCGTGGA TGATATATTA 8820

TCAAACGCAC GGCTCGAACA AGTCTTTTTT ATTATGTCTT CGTTATCTTT GTATGAAGGA 8880

ATAACAGAAT TACAATTAAT GTACTGAATA ATGCAATTAA TGTTGTGATT AGTGCTAATT 8940

TAATTTCTAT TGGTAGCCAA GTCAGTACAA AAGACCAATT ATTGCTACCG AGAATGAGAT 9000

ATGGTAATGC ATATAATATG AGCGCTAAAG CGATACATAT ACATAATGAT AACCAACTCA 9060

ATACAGCAAT CC 9072



(2) INFORMATION FOR SEQ ID NO: 46: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16826 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46: 

GTGGAACAGC TGTAACTATA TCATTTCTTT CAACATTTAT TGGGAAAATG TTAGCTACAT 60

TTCTATATCC GATTAATAAT GTAGTACTTT CATATATnTC TGTAAATGAA AGTGACAATA 120

TAAAGAAGCA ATATTTGaAA ACTAATCTAA TTGCTATAGC TGCCCTATGT TTAGTCATGA 180 

TTATATGTTA TCCAATTACA ATAATTATTG TCTCTTTACT GTATAACATT GATTCAAGTT 240 

TATATTCGAA GTTTATTATT TTAGGTAATA TAGGTGTTTT ATTCAATGCA GTGAGTATTA 300 

TGATCCAAAC TTTAAATACA AAACACGCAT CAATAACAT? ACAAGCGAAT TATATGACGC 360 

TTCACACGAT TACATTTATA TTCATAACTA TTTTAATGAC AATTGCGTTT GGTCTAAATG 420 

GATTCTTTTG GACAACGCTG TTCAGCAACA TTATTAAGTA TGTGATTTTA AATATTATAG 480 

GTTTAAAGTC TAAATTCATT AATAAAAAGG ACGTCGATTA GATGAGTGAA AAAAAGATTT 540 

TGATTTTATG TCAGTATTTT TATCCGGAAT ATGTATCTTC TGCGACGTTA CCAACTCAAT 600 

TGGCGGAAGA TTTAATTGCG AATCACATTA ATGTCGATGT CATGTGTGGA TGGCCATATG 660 

AATATAGTAA TCATAAACAG GTTTCTAAAA CCGAGATGCA TCGTGGTATT CGCATTCGAC 720 

GTCTCAAGTA TTCGAGGTTT AATAACAAAA GTAAGGTTGG AAGGATCATC AATTTCTTTA 780

GTTTATTTTC AAAATTCGTG ATTAATATAC CTAAAATGTT GAAATATGAT CAGATTCTTG 840 

TTTACTCTAA TCCACCAATC TTGCCATTAA TACCAGACGT TTTACACAGA CTGCTTAAGA 900 

AAAAATATTC TTTTGTGGTG TATGATATAG CACCTGATAA TGCGATTAAG ACAGGTGCAA 960

CTCGTCCAGG TAGCATGATT GATAAGCTGA TGCGTTACAT TAATAGACAT GTCTACAAGA 1020 

ATGCTGAAAA TGTCATTGTC CTTGGTACGG AAATGAAAAA CTACTTACTA AATCATCAAA 1080 

TTTCTAAAAA TGCTGACAAT ATCCATGTGA TTCCTAACTG GTATGACATG CGTCAATTAC 1140

AAG7£CAATCG TATCTATAAT GACACATTTA AAGCTTACCG TGAGCAATAC GACAAAATTT 1200 

TATTGTATAG CGGTAATATG GGGCAGTTAC AGGATATGGA GACACTTATC TCATTTTTAA 1260 

AATTAAATAA GGATCAGTCT CAAACGTTAA CAATACTTTG TGGTCATGGT AAGAAATTTG 1320 

CAGATGTCAA AACGGCAATA GaAGACCATC GTATTGAAAA TGTTAAAATG TTTGAGTTTT 1380 

TAACAGGTAC AGACTATGCT GACGTATTAA AAATTGCGGA TGTATGTATT GCATCGCTGA 1440

TTAAAGAAGG CGTCGGTTTA GGCGTGCCGA GCAAGAATTA TGGCTATCTT GCAGCTAAGA 1500 

AAGCGTTGGT ACTCATCATG GATAAGCAAT CTGATATCGT TCAACATGTT GAACAATATG 1560 

ATGCGGGTAT CCAAATTGAT AATGGCGATG CACATGCCAT TTATAACTTC ATCAACACTC 1620

ACTCGAGTAA GGAATTGCAC GAGATGGGTG AGCGCGCACA TCAACTGTTT AAAGATAAAT 1680 
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AAGCGATTAT TCGATGTAGT GAGTTCAATA TATGGTTTAG TAGTTTTAAG TCCGATTCTG 1800

TTAATTACAG CATTACTAAT TAAAATGGAa TCACCTGGAC CAGCCATTTT CAAACAAAAA 1860

AGACCGACGA TTAATAATGA ATTGTTTAAT ATTTATAAGT TTAGATCAAT GAAAATAGAC 1920 

ACACCTAATG TTGCAACTGA TTTAATGGAT TCAACATCGT ATATAACAAA GACAGGGAAG 1980 

GTCATTCGTA AGACCTCTAT TGATGAATTG CCACAATTAT TGAATGTTTT AAAAGGAGAA 2040 

ATGTCAATTG TAGGTCCTAG ACCAGCGCTT TATAATCAAT ACGAATTAAT CGAAAAACGT 2100 

ACAAAAGCGA ACGTGCATAC GATTAGACCA GGTGTGACAG GACTAGCTCA AGTGATGGGG 2160 

AGAGATGATA TCACTGATGA TCAAAAAGTA GCGTATGATC ATTATTACTT AACACATCAA 2220 

TCTATGATGC TTGATATGTA TATCATATAT AAAACAATTA AAAATATCGT TACTTCAGAA 2280

GGTGTGCATC ACTAATGAGA AAAAATATTT TAATTACAGG CGTACATGGA TATATCGGTA 2340

ATGCTTTAAA AGATAAGCTT ATTGAACAAG GACATCAAGT AGATCAAATT AATGTTAGGA 2400 

ATCAATTATG GAAGTCGACC TCGTTCAAAG ATTATGATGT TTTAATTCAT ACAGCAGCTT 2460 

TGGTTCACAA CAATTCACCT CAAGCAAGGC TATCTGATTA TATGCAAGTG AATATGTTGC 2520 

TGACGAAACA ATTGGCACAA AAGGCTAAAG CTGAAGACGT TAAACAATTT ATTTTTATGA 2580

GTACTATGGC AGTTTATGGA AAAGAAGGTC ATGTTGGTAA ATCAGATCAA GTTGATACAC 2640

AAACACCAAT GAACCCTACG ACCAACTATG GTATTTCCAA AAAGTTCGCT GAACAAGCAT 2700

TACAAGAATT GATTAGTGAT TCGTTTAAAG TAGCAATTGT GAGACCACCA ATGATTTATG 2760

GTGCACATTG CCCAGGAAAT TTCCAACGGT TAATGCAATT GTCAAAGCGA TTGCCAATCA 2820 

TTCCCAATAT TAACAATCAG CGCAGTGCAT TATATATTAA ACATCTGACA GCATTTATTG 2880 

ATCAATTAAT ATCATTAGAA GTGACAGGTG TGTACCATCC TCAAGATAGT TTTTACTTTG 2940 

ATACATCGTC AGTAATGTAT GAAATACGTC GCCAATCACA TCGTAAAACG GTATTGATCA 3000 

ACATGCCTTC AATGCTAAAT AAGTATTTTA ATAAGTTGTC GGTCTTTAGA AAATTATTCG 3060 

GCAATTTAAT ATACAGCAAT ACGTTATATG AAAATAATAA TGCACTTGAA ATTATTCCTG 3120 

GAAAAATGTC ACTTGTTATT GCGGACATCA TGGATGAAAC GACAACCAAA GATAAGGCAT 3180 

AAGTCATCTA TTAAATAAAA TCAACATACA AATCGTTTTA TTTGGAGGTT ATAGTATGAA 3240 

GTTAACAGTA GTTGGCTTAG GTTATATTGG 7TTACCAACA TCAATTATGT TTGCAAAACA 3300 

TGGcGTCGAT GTGCTTGGTG TTGATATTAA TCAGCAAACG ATTGATAAGT TACAAAGTGG 3360 

TCAAATTAGT ATTGAAGAAC CTGGATTACA AGAGGTTTAT GAAGAGGTAC TGTCATCGGG 3420 

AAAATTGAAG GTATCTACAA CGCCAGATGC ATCTGATGTT TTTATCATTG CCGTTCCGAC 3480 
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GGATTGAATT AATTGAGCCA TTAGATGCGA TTGAGTTCCA TAATTTTACA AATCAATCGT 5400 

ACCTCGTGCT GACAGATTCT GGTGGTATTC AAGAGGAGGC TCCTACATTT GGAAAACCTG 5460 

TGTTGGTATT AAGGAATCAT ACAGAGCGTC CCGAAGGCGT TGAGGCGGGA ACATCGAGAG 5520 

TAATTGGCAC AGATTATGAC AATATTGTTC GAAATGTGAA ACAATTGATT GAGGATGATG 5580 

AAGCGTATCA ACGTATGAGT CAAGCGAATA ATCCATATGG TGATGGACAA GCATCACGAC 5640 

GTATTTGTGA AGCAATAGAA TATTATTTTG GATTGCGCAC AGACAAGCCG GATGAATTCG 5700 

TACCTTTACG TCACAAATAA TAAAAAACCC CTAATCATGA AGTTGGTTTA GACAACCAGC 5760 

GGTGACTAGG GGTTTTTAAT ATATTTATTT TTGATAGTGG TAGCCAATAT CATATTTGAA 5820 

TACTTTATTT GATAATATTG GACTTTGCTG TCCATCGTCA TCACTTTTTA AACGTACATT 5880 

TTTATGAGCT TCTTTAAATA CATCGGAATT CAACCAATTA TTAAAGCTAT CTTCAGATTC 5940 

CCAAATAGTT AAGATTTTAA CTTCGTCTGT ATCCTCGGTA TTTAATGTTT TAGTGACAAA 6000 

CATTTGTTGG AAGCCTTCAA TAGTTTCAAT ACCTTGTCTA TTGTAAAAAC GTTCAATCGT 6060 

TTCTTCCGCA CTGCCTTTTT GTAATTGTAA TCTATTTTCT GCCATAAACA TGGGCAATCA 6120

CTCCTCTATT TTATGATTTG ATTTGGGTAA TGTTTTTACA AATGTAAAGA GTACAGCGGT 6180

TTGTATGATA ACCATTATGA TTAATCCTAC ACGGACTGCA AGAACATCCA CCATATAAAT 6240 

TGAAAAACCT ATTACAATGT ATAAGCTAAT TAAAATTTTA ATTTTCTGTT GTAGCGTGTA 6300 

GCCTCGATGT AAATAAAAGT TTTCTACATA TTCTTTATAA ATTTTTTGAT TAATAAGCCA 6360

ATTGTAAAAG CGATCTGAAC TTCGAGCAAA GCAAAAAACT GCTACGAGTA AAAAAGGGGT 6420 

CGTTGGCAGT AAAGGTAATA CGGCACCTGC AATACCAAGC GCTGTAAATA TTAAGCCAAT 6480 

GACGATTAAA ATAAGTCGCA TTGAAAAAAC TCCATTCTAG TACTAATGCG CATGTAATAT 6540 

TGTTTTAGTA ATATAACTCA TGCTAAATAT AATGTGTATG ATAAGTGCAA TGACTCAGTA 6600 

AAATGAAACG ATGTTGAATT ATCCTTGTCA CATTAACGCA TTTTAAGCGC GACTTTCATA 6660 

ACAACCAAAC TATTTAATGA GAATTATTCT CAAGTATTAT AGTTATATTA TGTGTTTTAT 6720 

TTTTGAAAAG TGCAATATGT TTTCGAAAAT AAGATTATTT TTATGTGCAA AAACGACGCA 6780 

AAAGTTTTAA AAATGAGACT TCTGTGAGCT GATTATTTTA TAAAATGTAA ACGCTTACTA 6840

TATAATGTGA ATCATATCGT TTAAAAGCAT TATTAAATAT GATGCTAAGA GATTTATATT 6900 

ATAGCCAATA AACAAAGGAG AGATAATATG GCAGTAAACG TTCGAGATTA TATTGCAGAG 6960 

AATTATGGTT TATTTATCAA TGGGGAATTT GTTAAAGGTA GCAGTGACGA AACAATCGAA 7020

GTGACTAATC CAGCAACTGG AGAAACACTA TCACATATTA CAAGAGCAAA AGATAAAGAT 7080 



TCAGAACGTG CACAAATGTT GCGTGATATT GGTGATAAAT TAATGGCACA AAAAGATAAA 7200 

ATTGCAATGA TTGAAACATT AAATAATGGT AAACCGATTC GTGAGACAAC AGCAATTGAT 7260 

ATTCCATTTG CTGCAAGACA TTTCCATTAT TTCGCAAGTG TTATTGAAAC AGAAGAAGGT 7320 

ACAGTGAATG ATATCGATAA AGACACAATG AGTATCGTAC GACATGAGCC GATTGGCGTC 7380 

GTAGGTGCTG TTGTTGCTTG GAACTTCCCA ATGCTATTAG CTGCATGGAA GATTGCGCCA 7440 

gCCATTGCTG CAGGTAATAC AATTGTGATT CAACCTTCGT CTTCAACACC ATTAAGTTTA 7500 

TTGGAAGTTG CTAAAATTTT CCAAGAGGTA TTACCTAAAG GTGTTGTCAA TATACTAACG 7560 

GGTAAAGGTT CAGAATCAGG TAATGCAATT TTCAATCATG ATGGTGTAGA TAAATTATCA 7620 

TTTACGGGCT CAACTGATGT AGGTTATCAA GTTGCCGAAG CTGCAGCAAA ACATCTAGTA 7680

CCCGCTACAT TAGAGCTTGG TGGTAAAAGC GCCAATATCA TATTAGATGA TGCTAATTTA 7740 

GACCTTGCAG TTGAAGGTAT TCAGTTAGGT ATTTTATTCA ACCAAGGTGA AGTATGTAGT 7800

GCAGGTTCTC GATTATTAGT TCATGAAAAA ATTTATGATC AATTGGTGCC ACGTTTACAA 7860 

GAGGCATTTT CAAATATTAA AGTTGGAAAT CCACAAGATG AAGCTACACA AATGGGTAGT 7920 

CAAACTGGTA AGGATCAATT AGATAAAATT CAATCATATA TTGATGCAGC AAAAGAATCA 7980

GATGCACAAA TTTTAGCAGG CGGTCATCGC TTAACTGAAA ATGGATTAGA TAAAGGGTTC 8040

TTCTTTGAGC CGACATTAAT TGctGTGCCA GACAATCATC ACAAATTAGC ACAAGAAGAA 8100 

ATATTTGGAC CAGTGTTAAC AGTGATTAAA GTGAAGGACG ATCAAGAAGC AATTGATATA 8160

GCTAATGATT CTGAGTATGG TTTAGCAGGC GGTGTATTTT CTCAAAATAT CACACGTGCA 8220 

TTAAATATTG CTAAAGCTGT ACGTACAGGA CGTATTTGGA TTAACACTTA CAACCAAGTA 8280

CCAGAAGGCG CACCATTTGG TGGTTATAAA AAATCAGGTA TCGGTCGAGA AACTTATAAA 8340 

GGTGCGTTAA GTAACTATCA ACAAGTTAAA AATATTTATA TTGATACAAG CAATGCTTTA 8400

AAAGGTTTGT ACTAGAATAA ATATCGTTTC TGAAGCGTGT TTGTAGGTCA GTCTAGCGGT 8460 

AAGTCTTAAC ATTTAACGGC GTTGTTTAGA TTTTAAGCAA AACAAAATAT ATAGGAACAC 8520 

GTATCATGAT ATTAGGATAT AATGACTAAA ATAATAGCAG TAGGATGGTT TTTAATTGCA 8580 

AATCATCTTA CTGCTGTTTT TAATTATGCT AATTTGCGAT GCGGCTATTA TAAGGACAGA 8640 

GTTGTTTATT AATTATGGTG ATTTAGAAAT ATGAAGTTCA ATATGCAAAG TCATCGTTTG 8700 

TTTTAATATG CGGAACAATC ATTAAAGTTA TTGCGATTTT TTGAACTTAA TGAAACTAAA 8760 

CAATAAATTT GAGATACTTT TTTGTCATTT TTATGTAACT AACACAATAA TCTCGTACAT 8820

TATTAAAATT TTCTATATGA TAGGAATAAA GCAAAGCGCG AGTGTGCTGT AAAAGTTTTC 8880
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GATGATGTAT AAATCATGGT TAATTACGGA AGCATTAATA TTAACCTGAG AAGCTATAAA 9000 

GAATTATTTT TAAAAGCGAC AATATTAAAT ACGACGCATT TATTTAGGAG TGGCAAACGT 9060 

ATGAATGGGA AAAAGGCGAA TACGATAAAC AGATACAAAT ATTTTCATCA TGTCAATCAT 9120 

CAAAAAATTC AACAAAGTTC TAAAAAGACG CTGTGGGCAT CACTAATCAT CACATTGTTA 9180 

TTTACAGTGA TTGAATTTGT CGGAGGTTTA GTATCTAATt CATTGGCATT ACTGTCAGAT 9240 

TCATTTCATA TGCTTAGTGA TGTATTAGCA CTTGGTTTAT CTATGTTGGC CATTTATTTT 9300 

GCAAGTAAAA AGCCGACTGC ACGATACACA TTTGGATATT TAAGATTTGA GATATTAGCT 9360 

GCATTTTTAA ATGGTTTAGC ATTAATTGTA ATTTCAATCT GGATTTTATA TGAAGCTATT 9420 

GTACGTATTA TTTATCCGCA ACCAATTGAA AGTGGCATTA TGTTTATGAT TGCTAGTATT 9480 

GGTTTACTCG TCAATATTAT TTTGACTGTT ATCCTTGTAA GGTCTTTAAA ACAAGAAGAC 9540 

AATATCAATA TTCAAAGTGC ATTATGGCAT TTCATGGGAG ACTTATTGAA CTCTATTGGT 9600 

GTCATCGTTG CAGTTGTATT GATTTACTTT ACAGGATGGC GCATCATCGA CCCAATCATT 9660 

AGTATTGTAA TTTCACTCAT CATTTTACGT GGTGGTTATA AAATTACGCG TAATGCgTGG 9720 

tTAATTTTAA TGGAAAGTGT GCCTCAACAT TTGGATACTG ATCAAATTAT GGCAGATATT 9780 

AAAAACATAG ATGGCATATT AGATGTACAT GAATTTCATT TGTGGAGTAT TACAACAGAG 9840

CATTATTCAT TAAGTGCCCA TGTTGTGTTA GATAAAAAAT ATGAGGGTGA TGATTATCAA 9900 

GCGATTGATC AAGTATCATC ATTGTTGAAA GAAAAATATG GCATTGCACA TTCAACGTTG 9960

CAAATTGAAA ACTTGCAATT GAATCCATTA GATGAGCCAT ACTTCGACAA ATTAACATAA 10020 

ATAAAACATT GTAGCGCCTA AAACATTAAT CTATGTCATA GGCGCACGTT TCGTTTTATA 10080 

CTTATGTTGC ATCATTTAAA TGATTTTCGT CAATTTCTTT GATGCTATCT ACATCTAACA 10140

CGACATCTTT AGGTTTCAAA ATATGAATAT GTTTTTCATC ATTTGTATGT AAAATGCGTT 10200 

CTATGATGTA CCTTTGACCG GCCATTGTTT CTACAGCAAT CTTTTTGTTT CTAGCTAAAC 10260 

TTGCTACGAC AGATTCTTTA TCCATAATGA TAGCCCCCTA TATATATGTT TATTTACTTA 10320 

TACCCTAACA TGATTTTTAT ACTCTTTGAA AATATATTTT ACAGAATTTT ATCTAAATAT 10380

TTAAAAAAAT ATCTTAATAT CCTTGTAATC CGATAAGAAT TATAGTAATA TTTTTTCAAC 10440

CATtGTTATA GGAGGTCTTA TTAATGACAT TATTTTTATT AGAAGCTAAC AATCTTGATT 10500 

TTGCATCAAC GAAAGAAGAA CTAGAAGCAA AGGCAGCATC ACTATCTACG AAGACAATTC 10560 

CAACATTAAT TGAAGTACAA GCTACTGAAA ATTTAACTCA TGGTTATTTT ATTGTGGAAG 10620 

CAAATGACGA aGCAGAAGCT AAACAATTTT TAACAGAAGC AGATATTAGT ATTCAATTAG 10680 
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TTGATTACCT TGTAACTTGG AACATTCCGG AAGGCATTAC GATGGATCAA TATTTAGCAC 10800 

GTAAAAAGAA AAATTCTGTT CATTATGAAG AAGTGCCAGA AGTTGAATTT AAACGCACAT 10860 

ATGTATGTGA AGATATGTCT AAATGTATTT GTTTATACAA CGCACCTGAT GAAGAAGCGG 10920

TACGTCGCGC GCGCAAAGCA GTTGATACAC CGATTGATGG CATCGAAAAA CTTTAATAAG 10980 

ACAACAAGTT GATGAGATAT ATGTATATAG GTTTGGCATG GATTTCGATT GCAGTTAATT 11040 

AGAATAGCTC AATGCTATAA ATGTAAGTAG TTGATATGAA GAAACTAATG AACTAAATGC 11100 

AAGTATTGTC TAAAACAATC ATTTTATTGA AATTTAGTAG AGCTGAAATT AATATAACGT 11160 

CGTTAATTGA ATAACGCTTA TGTTATAAGA GCACTCATAC CAAACCATAA TCATCTATAG 11220 

ATATAACAAT TCACGATATA AGGGCTGTGT TTGGCATAGC CCTTTAGATA TACACTTAAT 11280 

TCCTATTAAA ATAGTAGGGA TTAAAAGGGG GCTTGTCATG ATTAAAATTC AACAATTACA 11340 

ACATCACTTT GGATCACATA AAGTAATTCA TAACTTTAAT TTGGACATTA GCAAGGGAGA 11400

AATAGTCACT TTCATAGGGA AAAGTGGTTG CGGAAAGTCT ACTTTACTCA ATATTATCGG 11460

TGGATTTATT CATCCATCGT CTGGTCGTGT CATTATTGAT AACGAAATTA AACAACAGCC 11520 

ATCTCCAGAT TGTTTAATGC TATTTCAACA TCATAATTTG CTGCCATGGA AAACGATTAA 11580 

TGACAACATT AGGATTGGAT TACAACAGAA AATTAGTGAT GAAGAGATTA ACGCACAGCT 11640 

TAAATTAGTT GATTTAGAAG ACAGGGGAAA GCATTTTCCC GAGCAACTGT CCGGGGGTAT 11700 

GAAACAACGT GTGGCACTAT GTCGAGCGCA TGTGCATAAG CCTAACGTTA TATTGATGGA 11760 

TGAGCCATTA GGTGCATTAG ATGCATTTAC ACGTTATAAA CTTCAGGATC AACTAGTGCA 11820 

aCTAAAACAT AAAACGCAAT CAACTATTAT TTTAGTGACG CATGACATTG ATGAAGCTAT 11880 

TTATCTTTCC GACCGCATTG TTCTGTTAGG TGAAGGGTGC AATATTATTT CTCAATATGA 11940 

AATTACAGCA TCACATCCAC GCAGTCGTAA TGATAGCCAC CTACTTAAGA TTCGTAATGA 12000 

AATTATGGAA ACATTTGCAT TGAATCATCA TCAAGTTGAA CCTGAATATT ATTTATAAGG 12060 

AGTGAGTGAC GATGAAAAGG TTAAGCATAA TCGTCATCAT TGGAATCTTT ATAATTACAG 12120 

GATGTGATTG GCAAAGGACG TCTAAAGAAC GGTCTAAAAA TGCCCAAAAT CAGCAAGTGA 12180 

TTAAAATTGG ATATTTGCCG ATTACACATT CAGCTAATTT GATGATGACT AAAAAATTAT 12240 

TATCACAATA CAATCATCCG AAATATAAAC TAGAATTAGT TAAATTCAAT AATTGGCCAG 12300 

ATTTAATGGA CGCATTAAAC AGTGGTCGTA TTGATGGTGC ATCAACTTTA ATAGAGCTAG 12360 

CGATGAAATC AAAACAGAAG GGCTCAAATA TAAAGGCTGT GGCATTGGGC CATCATGAAG 12420 

GCAATGTCAT TATGGGACAA AAAGGTATGC ACTTAAATGA ATTTAATAAT AATGGCGATG 12480 
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GTAAACAATT AAAGATTAAA CCGGGGCATT TTAGCTATCA TGAAATGTCG CCAGCAGAAA 12600 

TGCCAGCCGC ATTGAGTGAA CACAGAATTA CAGGGTATTC TGTAGCCGAA CCATTCGGTG 12660 

CACTGGGTGA AAAGTTAGGC AAAGGTAAGA CTTTGAAACA TGGTGATGAC GTTATACCTG 12720 

ATGCGTATTG CTGTGTGCTA GTACTGAGAG GGGAATTGCT TGATCAACAC AAGGATGTAG 12780 

CGCAAgCATT TGTACAAGAT TATAAAAAGT CTGGCTTTAA AATGAATGAT CGCAAGCAAA 12840

GTGTAGACAT TATGACGCAT CATTTTAAAC AAAGTCGTGA CGTTTTAACA CAGTCAGCGG 12900 

CATGGACATC CTATGGTGAT TTAACAATTA AGCCATCCGG CTATCAAGAA ATTACGACAT 12960 

TGGTAAAACA ACATCATTTG TTTAATCCAC CTGCATATGA TGACTTTGTT GAACCGTCAT 13020 

TGTATAAGGA GGCATCGCGT TCATGACACG TCCCACAAAT AACAAATTTA TATTACCTAT 13080 

TATCACATTT ATTATTTTCT TAGGCATTTG GGAAATGGTC ATTATTATTG GGCATTACCA 13140 

ACCTGTATTG TTACCGGGTC CTGCTCTTGT AGGAAAAAGT ATATGGTCTT TCATTGTTAC 13200 

TGGAGAAATT TTCCAACATT TAGCAATTAG TTTATGGAGA TTTGTAGCGG GCTTTGTTGT 13260 

CGCATTGTTG GTTGCTATTC CATTGGGCTT CTTGCTTGGA AGGAATCGTT GGCTATACAA 13320

CGCTATCGAA CCGCTATTTC AATTGATTAG GCCGATATCT CCGATAGCAT GGGCACCATT 13380 

TGTTGTTCTA TGGTTTGGTA TTGGTAGTTT GCCAGCGATT GCGATTATTT TTATCGCTGC 13440 

TTTTTTCCCA ATTGTGTTCA ATACTATTAA AGGCGTTAGA GACATTGAAC CTCAATATTT 13500

AAAAATAGCA GCAAATTTAA ATTTAACTGG GTGGTCATTG TATCGCAATA TATTATTTCC 13560 

CGGGGCATTT AAACAAATCA TGGCTGGGAT ACATATGGCG GTAGGAACAA GTTGGATATT 13620 

TTTAGTTTCT GG7GAAATGA TTGGTGCACA ATCGGGATTA GGTTTTTTAA TCGTTGATGC 13680 

ACGAAATATG TTGAACTTAG AAGATGTTTT AGCAGCAATA TTCTTTATCG GATTATTTGG 13740

TTTXATTATT GATCGATTCA TTAGTTATAT TGAGCAGTTT ATACTTAGAA GATTTGGTGA 13800

ATAAGGAGAG ATGATGATGA CTTTAGAAAC GCTTATCAAA GAACAATTAG ATCCTCATTT 13860

AGTAGAAGTT GATGAAGGGA CGTATTATCC GAGAACATTT ATTCAGCAAT TATTTGTAGA 13920 

TGGTTATTTC GGTGAGGCGG CATTGAGAAA AAATGCTGAA GTAATCGAAG CTGTATCGCA 13980

GTCTTGTTTG ACAACAGGAT TTTGTTTATG GTGCCAATTA GCTTTTTCAA CGTATTTAGA 14040 

AAATGCCACG CAGCCACATT TAAATAATGA CTTACAACAG CAATTGTTAT CTGGAGAAAT 14100 

ATTAGGTGCT ACCGGATTGT CTAATCCGAT GAAGTCATTT AATGATTTAG AAAAGTTGAA 14160 

CCTTGAACAC ACTTATGTTG ATGGACAATT GGTTGTCAGT GGACGTATGC CAGCTGTAAG 14220 

TAATATTCAA GAAGACCATT ATTTTGGTGC GATTTCGAAA CATGAATCAT CAGATGAATT 14280 



TTTAGGAGTC AACGGGTCAG CAACGTATCA AATCACATTG AATCAAGTCG TAGTGCCACA 14400

ATCACAAATT ATCACGCATG ATGCGAAGCA GTTTGCGGCA ACTATTCGCC CGCAATTTAT 14460 

TGCTTACCAA ATTCCAATAG GATTAGGCTC AATTAAAAGT TCTTTAGAGT TAATTGATGC 14520 

ATTTTCAAAT GTGCAAAACG GAATAAATCA ATATTTAGAG TATGATGTTG AAGCTTTTAA 14580 

AAAACGTTAT CGTCAACTTA GAGAGGAATA TTATGCAATA TTAGATGACG GTAACTTAAC 14640 

TTCACATTTA AATGAATTAA TATCATTGAA GAAGGACATC GGCTATTTAT TGTTAGATGT 14700 

AAATCAAGCT TCTGTTGTCA ATGGTGGTTC TAGAGCGTAC ACACCATATT CGCCACAAGT 14760 

TCGCAAGTTA AAAGAAGGAT TCTTCTTCGC AGCATTGACA CCGACATTAA GACATTTAGG 14820

TAAACTTGAA GCAGAGTTGA AGGGGTAAGT GTGATAAGCT GATTTTTTGT TTAGATGCGT 14880

TTGTTGAAAC ATTTTTTAAA ATAATATAAA TCTTAGTTTA TAAACATTTT CTGTTAATTT 14940 

GTTATATCCT TTTAACTAGG AAAATATACA TTTCGTAATA ATAATAATCG TTATCATTGA 15000 

AAAAGTGTTA ATAAGGTGTA TAATGAAAAT GTGAACAATT AATGAACTTC TTATTTTAAA 15060

GAAGGTGAAT ACTATAGATA CGCATACTAA AGAACAACAA TTCTCGAATC TAGTAAGATC 15120 

TTATCGTAAA GAATACGTGG GTAAAGGACC CAATAGTATT CGAGTGTCGT TTAAAGATAA 15180 

TTGGGCGATT GCACATATGA CAGGTGTTTT GAGTAAAGTT GAGAGTTTTT ACCTAAACGA 15240

CAAACGCAAT GAATCGATGC TCCATTATAC ACGCACAGAG AAGATTAAAC AGATGTATAA 15300 

AGAAATAGAT GTAAATGAGA TGGAAAGTCT TGTAGGCGCT AAGTTTGTAA AATTATTTAC 15360 

AGATATTGAT TTGAATGATG ATGAAGTCAT TTCAATATTT GTTTTCGATA AGTCAATAGA 15420 

ATAAGTGTTG CTGGTGTAAG GTACACGGTG CTGTTTGCTA ACTTCGCTTT GAATTTAACA 15480

ATAATTCAAG GGGGTGGTAT GTCAAACGGT GCCGTTTTTT TGTCATATTT TTAAAACAAG 15540 

CAACATGCAA CACGTACTTT AAGGAAGTCA AAATTTATCA TTTAGGAGAG ATGGATATGA 15600 

AAATCGTAGC ATTATTTCCA GAAGCAGTAG AAGGTCAAGA AAATCAATTA CTTAATACTA 15660 

AAAAAGCATT AGGATTAAAA ACATTTTTAG AGGAAAGAGG ACATGAGTTC ATTATATTAG 15720 

CAGATAATGG TGAAGACTTA GATAAACATT TACCAGATAT GGATGTGATT ATTAGTGCGC 15780 

CATTTTATCC TGCATATATG ACTCGTGAAC GTATTGAAAA AGCACCGAAC TTGAAATTAG 15840

CAATTACAGC AGGTGTAGGA TCTGACCATG TAGATTTAGC GGCAGCAAGT GAACACAATA 15900 

TTGGTGTCGT TGAAGTTACA GGAAGTAATA CAGTTAGTGT GGCAGAACAT GCGGTTATGG 15960 

ATTTATTAAT ACTTCTTAGA AACTATGAAG AAGGTCATCG TCAATCAGTA GAAGGTGAAT 16020 

GGAACTTGTC TCAAGTAGGT AATCATGCGC ATGAATTACA ACACAAAACA ATTGGTATTT 16080 
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TACAACACTA TGATCCAATC AATCAACAAG ACCATAAATT GTCTAAATTT GTAAGCTTTG 16200 

ATGAACTTGT TTCAACAAGT GATGCGATTA CAATTCATGC ACCATTAACA CCAGAAACTG 16260 

ATAACTTATT TGATAAAGAT GTTTTAAGTC GTATGAAAAA ACACAGTTAT TTAGTGAATA 16320

CTGCACGTGG TAAAATTGTA AATCGCGATG CGTTAGTTGA AGCGTTAgCA TCCGAGCATT 16380 

TACAAGGATA TGCTGGTGAT GTTTGGTATC CaCAACCtGC ACCTGCTGAT CATCCATGGA 16440 

GAACAATGCC TAGAAATGCT ATGACGGTTC ACTATTCAGG TATGACTTTA GAAGCACAAA 16500

AACGTATTGA AGATGGAGTT AAAGATATTT TAGAGCGTTT CTTCAATCAT GAACCTTTCC 16560

AAGATAAAGA TATTATTGTT GCAAGTGGTC GTATTGCTAG TAAAAGTTAT ACAGCTAAAT 16620 

AGAATAAGGA TGCTGGGCTA GCGATTAACG CTTTCAATTT TATATAAATG AATCATATAA 16680 

GCACTACTGC TGTTGTAAAG ATGGCAGTAG TTTTTTTATG ATTACATCTA AGTATAGTCA 16740 

CGGCTATGTT AGGACAATGA TTTAACATTT ACGCACATAT GTGTTCACTT ACGCAATTAT 16800 

TGAnAAATnT CATTCATGTG GnAATC 16826 

(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4012 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:
TTCAATGAGA GTAGTGGGCT GATGTTTAGC GATATCGCGT AAGATTAACC ATTGGCCATA 60 

ATATATATTG TGTTTTTCTA AAATCGGCTC GGCTAATTTT AAATAGGGGC GATATATTGT 120

TATAAAACTA TTGAAAAATT CTTGTGATAG CATAGTGACA TCTCCTAAGA CAAAATAGTT 180 
AGCTTAGCTA mCCTTTTTAC AACAATAGTA ATTATAAAAC GGGAGCAATT AGAAATCAAT 240 

ATATAATTAT TAAGAGCAAA AATAATTATA CTTTGTTAAA ATAAGCGTAA TTACATGTAA 300

ATAGGGGGAT ACTAATGATA TTGAAATTTG aTCACATCAT TCATTATATA GATCAGTTAG 360 
ATCGGTTTAG TTTTCCAGGA GATGTTATAA AATTACATTC AGGTGGGTAT CATCATAAAT 420 

ATGGAACATT CAATAAATTA GGTTATATCA ATGAAAATTA TATTGAGCTA CTAGATGTAG 480 
AAAATAATGA AAAGTTGAAA AAGATGGCAA AAACGATAGA mGGCGGAGTC GCTTTTGCTA 540 
CTCAAATTGT TCAAGAGAAG TATGAGCAAG GCTTTAAAAA TATTTGTTTG CGTACAAATG 600 

ATATAGAGGC AGTTAAAAAT AAACTACAAA GTGAGCAGGT TGAAGTAGTA GGGCCGATTC 660 
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ATCAGGATGA TGATGAAATT AAGCCACCAT TTTTTATTCA ATGGGAAGAA AGTGATTCCA 780 

TGCGTACTAA AAAATTGCAA AAATATTTTC AAAAACAATT TTCAATTGAA ACTGTTATTG 840 

TGAAAAGTAA AAACCGATCA CAAACAGTAT CGAATTGGTT GAAATGGTTT GATATGGACA 900 

TTGTAGAAGA GAATGACCAT TACACAGATT TGATTTTAAA AAATGATGAT ATTTATTTTA 960 

GAATTGAAGA TGGTAAAGTT TCAAAATATC ATTCGGTTAT CATAAAAGAC GCACAAGCAA 1020 

CTTCACCATA TTCAATTTTT ATCAGAGGTG CTATTTATCG CTTTGAACCA TTAGTATAAA 1080 

TATACGTAAG TGCTATGAGC GAGAATGCCC ATATGAATAA TGACAAGCAC AATGGAAAGA 1140 

ATCGTTAATA TATTATTTAA TCGTGATGAC TTAATTAAAA TGAAAAAGAT TGATAATATA 1200 

AATGTGAAAA AGATAAGTAT AACCCGTAAA CTAAAGTAAT TCACGGTGAG AGGTTGACTC 1260 

AATGTCATAA TGATTGCAAC GATGTTCATA ATTATAAATA GACTTAAAAT AATTGTTCTC 1320 

ATATCAAACA CCTCATTGTT AGATTATTGA CATTATAACA GGGGTAATTG TATATGAACA 1380 

TTAATGTGGT TGCTTGAGGA AAAATTTATT CATTGAAGTC AAGTTGGTTC ATTTTAGAAA 1440 

TGAATATCGT GTTAGATGAT GAAAGTATAT TGAAGTATAG GTAACTAGTT GAAAAGTATT 1500 

AATTGTACGA TAACATTAAA TTTAACACGA AACATAGATA TAAAATGATT CACAATTAAA 1560 

ATGGGTAAAT TTGAACTTGC TAAACTATTA ATTGGAGCAT GGACATTTCA AAAATAAGAG 1620 

TTCAAATCTT ACACAAGCTC TGAATCGACA CTATAAGATA CAAACTGTAT AATTAAAGGT 1680

ATTGTTAAAT AGAAGGAGAT ATCATAAATC ATGGAAAAGA TGCATATCAC TAATCAGGAA 1740

CATGACGCAT TTGTTAAATC CCACCCAAAT GGAGATTTAT TACAATTAAC GAAATGGGCA 1800 

GAAACAAAGA AATTAACTGG ATGGTACGCG CGAAGAATCG CTGTAGGTCG TGACGGTGAA 1860 
GTTCAGGGTG TTGCGCAGTT ACTTTTTAAA AAAGTACCTA AATTACCTTA TACGCTATGT 1920

TATMTTCGC GTGGTTTTGT TGTTGATTAT AGTAATAAAG AAGCGTTAAA TGCATTGTTA 1980 

GACAGTGCAA AAGAAATTGC TAAAGCTGAG AAAGCGTATG CAATTAAAAT CGATCCTGAT 2040 

GTTGAAGTTG ATAAAGGTAC AGATGCTTTG CAAAATTTGA AAGCGCTTGG TTTTAAACAT 2100 

AAAGGATTTA AAGAAGGTTT ATCAAAAGAC TACATCCAAC CACGTATGAC TATGATTACA 2160 

CCAATTGATA AAAATGATGA TGAGTTATTA AATAGTTTTG AACGCCGAAA TCGTTCAAAA 2220 

GTGCGCTTGG CTTTAAAGCG AGGTACGACA GTAGAACGAT CTGATAGAGA AGGTTTAAAA 2280 

ACATTTGCTG AGTTAATGAA AATCACTGGG GAACGCGATG GCTTCTTAAC GCGTGATATT 2340 

AGTTACTTTG AAAATATTTA TGATGCGTTG CATGAAGATG GAGATGCTGA ACTATTTTTA 2400 

GTAAAGTTGG ATCCAAAAGA AAATATAGCG AAAGTAAATC AAGAATTGAA TGAACTTCAT 2460 
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CAAAATATGA TTAATGATGC GCAAAATAAA ATTGCTAAAA ATGAAGATTT AAAACGAGAC 2580 

CTAGAAGCTT TAGAAAAGGA ACATCCTGAA GGTATTTATC TTTCTGGTGC ACTATTAATG 2640 

TTTGCTGGCT CAAAATCATA TTACTTATAT GGTGCGTCTT CTAATGAATT TAGAGATTTT 2700 

TTACCAAATC ATCATATGCA GTATACGATG ATGAAGTATG CACGTGAACA TGGTGCAACA 2760 

ACTTACGATT TCGGTGGTAC AGATAATGAT CCAGATAAAG ACTCAGAACA TTATGGATTA 2820 

TGGGCATTTA AAAAAGTGTG GGGAACATAC TTAAGTGAAA AGATTGGTGA ATTTGATTAT 2880

GTATTGAATC AGCCATTGTA CCAATTAATT GAGCAAGTTA AACCGCGTTT AACAAAAGCT 2940 

AAAATTAAAA TATCTCGTAA ATTAAAACGA AAATAGATTA ACGACTGAAA TCTGAACGCT 3000 

CATAAGACTG TCATTTGCGT TCAGATTTTT TTACACAATA TAGAATGGTT GAGTAAAATA 3060

TTTTTGAATA TAGTGAAAGA GGGGGAAGTA CTGTGATAAA AAAGCTATTA CAATTTTCTT 3120 

TAGGGAATAA GTTTGCTATC TTTTTAATGG TTGTTTTAGT TGTCTTGGGC GGTGTATATG 3180 

CGAGTGC7AA ATTGAAATTA GAATTACTAC CAAATGTACA AAATCCAGTT ATTTCAGTTA 3240 

CAACAACAAT GCCGGGTGCA ACGCCACAAA GTACCCAAGA TGAAATAAGT AGTAAAATTG 3300 

ACAATCAAGT AAGATCATTG GCATATGTGA AAAATGTTAA AACGCAATCC ATACAAAATG 3360

CTTCAATTGT AACAGTTGAA TATGAAAATA ATACAGATAT GGATAAAGCA GAAGAACAGC 3420 

TTAAAAAAGA AATCGATAAA ATTAAATTTA AAGATGAAGT TGGTCAACCA GAATTAAGAC 3480 

GTAATTCGAT GGATGCTTTT CCGGTTTTAG CATATTCATT TTCAAATAAA GAGAATGACT 3540

TGAAAAAAGT AACGAAAGTA CTGAATGAAC AATTAATACC AAAATTGCAA ACGGTAGATG 3600 

GTGTGCAAAA TGCGCAATTA AATGGGCAGA CGAACCGTGA AATCACCCTT AAATTTAAGC 3660 

AAAATGAACT TGAAAAATAT GGGTTGACTG CTGATGATGT AGAAAACTAT CTAAAAACGG 3720

CAACAAGAAC AACGCCACTT GGATTGTTCC AATTTGGTGA TAAAGATAAT CAATTGTTGT 3780 

TGATGGTCAA TATCAATCTG TTGATGCTTT TAAAAACATA AATATTCCAT TAACGTGGCA 3840 

GGAGGACCAA GGGCATCTCA TCCCAAAGTG ACCATAAACC AAATTCAGCC ATGTCAGACG 3900 

TTATCAGGCA TCACCACAGC AAATTCAAAG CGTCAGCnCC AATATATAGT GGATGCCGCA 3960 

nGAACTAGGG GTTTAGCGnT ATCAGTGGTG TGGCGACTCT ATTCTAAACG AT 4012 
(2) INFORMATION FOR SEQ ID NO: 48: 
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7778 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS : double

(D) TOPOLOGY: linear 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48: 
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CAATATAGGT CGCCGAGTTT CAACTaCATC AACTGGTTCA GTTACATTAG ATAATGCGCT 
AGGTGTAGGT GGCTATCCTA AAGGACGAAT TATTGAAATT TATGGTCCTG AAAGTTCTGG 
TAAGACAACA GTAGCGCTTC ACGCTATTGC TGAAGTACAA AGTAATGGCG GGGTGGCAGC 
ATTTATCGAT GCTGAACATG CTTTAGATCC AGAATATGCT CAAGCATTAG GCGTAGATAT 
CGATAATTTA TATTTATCGC AACCGGATCA TGGTGAACAA GGTCTTGAAA TCGCCGAAGC 
ATTTGTTAGA AGTGGTGCAG TTGATATTGT AGTTGTAGAC TCAGTTGCTG CTTTAACACC 
TAAAGCTGAA ATTGAAGGAG AAATGGGAGA CACTCACGTT GGTTTACAAG CTCGTTTAAT 
GTCACAAGCG TTACGTAAAC TTTCAGGTGC TATTTCTAAA TCAAATACAA CTGCTATTTT 
CATCAACCAA ATTCGTGAAA AAGTTGGTGT TATGTTCGGT AATCCAGAGA CTACACCAGG 
TGGACGTGCA TTAAAATTCT ATAGTTCAGT AAGACTAGAA GTACGTCGTG CAGAACAGCT 
TAAACAAGGA CAAGAAATTG TAGGTAATAG AACTAAAATT AAAGTCGTTA AAAATAAAGT 
GGCACCACCA TTTAGAGTAG CTGAAGTTGA TATTATGTAT GGACAAGGTA TTTCTAAAGA 
GGGTGAACTT ATTGATTTAG GTGTTGAAAA CGACATCGTT GaTAAATCAG GAGCATGGTA 
TTCTTACAAT GGCGAACGAA TGGGTCAAGG TAAGGAAAAT GTTAAAATGT ACTTGAAAGA 
AAATCCACAA ATTAAAGAAG AAATTGATCG TAAATTGAGA GAAAAATTAG GTATATCTGA 
TGGTGATGTT GAAGAAACAG AAGATGCACC AAAGTCATTA TTTGACGAAG AATAGTACAC 
AAATTTATAT CTATAGTTAA ACTTAGCAAA TATCCTTATA GGATTGATTG AAAGTGATAT 
TCATCTCATA AAGCTAGAAT AATATCTAAC TTTATGGGAT ACACTACAAA TCGAGACTAT 
AAGGTTTTTT ATTTTATTTA TTATTACATT ATCAATAGTT TTATAATCGA GCTTCAAAAC 
TTTAGAAAAT AGTAGAAATA GCATTCAATA TAGTGCAAAA GTGCAAATTG ATAACTTGAC 
ACTTATCTCC TATAAACCGT ACAATTAATT TGTATGATTT ATATATAATT TCATAAAGTC 
ATATTGAATT TCATATAAAG AGCAAACCCT AGAAAAGGAG GTGTTTGTGT GAATTTATTA 
AGCCTCCTAC TCATTTTGCT GGGGATCATT CTAGGAGTTG TTGGAGGGTA TGTTGTTGCC 
CGAAATTTGT TGCTTCAAAA GCAATCACAA GCTAGACAAA CTGCCGAAGA TATTGTAAAT 
CAAGCACATA AAGAAGCTGA CAATATCAAA AAAGAGAAAT TACTTGAGGC AAAAGAAGAA 
AACCAAATCC TAAGAGAACA AACTGAAGCA GAACTACGAG AAAGACGTAG CGAACTTCAA 
AGACAAGAAA CCCGACTTCT TCAAAAAGAA GAAAACTTAG AGCGCAAATC TGATCTATTA 
GATAAAAAAG ATGAGATTTT AGAGCAAAAA GAATCAAAAA TTGAAGAAAA ACAACAACAA 
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CGCATCTCCG GTCTCACTCA AGAAGAAGCT ATTAATGAGC AACTTCAAAG AGTAGAGGAA 1300 

GAACTGTCAC AAGATATTGC AGTACTTGTT AAAGAAAAAG AAAAAGAAGC TAAAGAAAAA 1860 

GTTGATAAAA CAGCAAAAGA ATTATTAGCT ACAGCAGTAC AAAGATTAGC AGCAGATCAC 1920 

ACAAGTGAAT CAACGGTATC AGTAGTTAAC TTACCTAATG ATGAGATGAA AGGTCGAATC 1980 

ATTGGACGAG AAGGACGAAA CATCCGCACA CTTGAAACTT TAACTGGCAT TGATTTAATT 2040 

ATTGATGACA CACCAGAAGC GGTTATATTA TCTGGTTTTG ATCCAATAAG AAGAGAAATT 2100 

GCTAGAACAG CACTTGTTAA CTTAGTATCT GATGGACGTA TTCATCCAGG TAGAATTGAA 2160 

GATATGGTCG AAAAAGCTAG AAAAGAAGTA GACGATATTA TTAGAGAAGC AGGTGAACAA 2220 

GCTACATTTG AAGTGAACGC ACATAATATG CATCCTGACT TAGTAAAAAT TGTAGGGCGT 2280 

TTAAACTATC GTACGAGTTA CGGTCAAAAT GTACTTAAAC ATTCAATTGA AGTTGCGCAT 2340 

CTTGCTAGTA TGTTAGCTGC TGAGCTAGGC GAAGATGAGA CATTAGCGAA ACGAGCTGGA 2400 

CTTTTACATG ATGTTGGTAA AGCAATTGAT CATGAAGTAG AAGGTAGTCA TGTTGAAATC 2460 

GGTGTAGAA? TAGCGAAAAA ATATGGTGAA AATGAAACAG TTATTAATGC AATCCATTCT 2520 

CATCATGGTG ATGTTGAACC TACATCTATT ATATCTATCC TTGTTGCTGC TGCAGATGCA 2580

TTGTCTGCGG CTCGTCCAGG TGCAAGAAAA GAAACATTAG AGAATTATAT TCGTCGATTA 2640 

GAACGTTTAG AAACGTTATC AGAAAGTTAT GATGGTGTAG AAAAAGCATT TGCGATTCAG 2700 

GCAGGTAGAG AAATCCGAGT GATTGTATCT CCTGAAGAAA TTGATGATTT AAAATCTTAT 2760

CGATTGGCTA GAGATATTAA AAATCAGATT GAAGATGAAT TACAATATCC TGGTCATATC 2820

AAGGTGACAG TTGTTCGAGA GACTAGAGCA GTAGAATATG CGAAATAATT TTTGTCTCCC 2880 

TCACAAATTA GTGAGGGAGC TTTTTTAAGT TGTAGTCTTA AtCTAGTTAG ACAGCACTTT 2940 

ATCGGTAATA ACTATATTAA ACAGTAGTTA TTTGAAAGTA AGACGGACCT TATATTAAAT 3000

AAGAAGTTAT TGCTTTTAAT AAAAATGTTT TAGGCTTCGT AATTACTATA TTTATATTAT 3060 

GTAAACCTAT AAAGATGATT GGTTTTCTAT CCAATAAAAA AGAAGAGAAG ATGTAACACA 3120 

TCTTCTCTTC yGCAATATTA ATTAGGATTT ATTTCTAAGT TGAGTTATTT TAATTGTAAA 3180 

TCTGTTTTCT TTAATTCTTT TATAACTTCT GCAGTATCAT AACAATTTGT TGCAATTGTT 3240 

GAATATCTCT CTGCTAAACG ATATGCATTA ATGTAAAGCT TTAAACTTTC TTTAGCTATA 3300 

TCCTCTGCAT CTTCGAATTT TGATGGGTTA GACATAACCA CTAATTCTGC AAATTTTTCT 3360 

GGATCAATAT TAATAGACAT GTATTTATTT ACAACTCCTA TTTATTTTGA TGTCTTAATA 3420 

CTAACATATT GAAGTTTTCA GACAAAGTAA TGTCTCTCTA TAATTGAAGA AAAATAATTC 3480 
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GGATGAACAA AACATGAGAA TAATGTTTAT 
AGGGGATATC GTAGGTAAAA TTGGACGAGA 3600

CGCAATTGAA ACGTACATAC CTCAACTGAA GCAAAAGTAT AAACCAACAG TTACAATTGT 3660

AAATGCTGAA AATGCAGCAC ATGGTAAAGG TTTGACTGAA AAAATATATA AACAATTACT 3720

AAGAAATGGT GTAGATTTCA TGACTATGGG TAATCACACA TATGGTCAAC GTGAAATTTA 3780

TGATTTTATA GATGAAGCAA AACGACTAGT AAGACCAGCG AATTTTCCGG ATGAAGCGCC 3840

GGGAATTGGT ATGAGATTTA TACAAATTAA TGATATTAAA CTTGCAGTTA TTAATCTGCA 3900

AGGAAGAGCG TTTATGCCAG ATATTGATGA TCCTTTTAAA AAGGCAGATC AATTAGTCAA 3960

GGAAGCACAA GAACAAACTC CGTTTATATT TGTTGATTTT CATGCAGAAA CAACTTCTGA 4020

AAAGTATGCA ATGGGATGGC ATTTAGATGG TAGAsTAGCG CTGTTGTTGG AACGCATACA 4080

CACATTCAAA CAGCAGATGA ACGTATTTTA CCAAAGGGGA CAGGGTATAT AACGGATGTT 4140

GGTATGACAG GTTTTTATGA TGGCATTTTA GGAATAAATA AAACAGAGGT AATTGAGCGT 4200

TTTATCACTA GTTTGCCACA AAGACATGTT GTTCCAAATG AAGGTAGAAG TGTATTATCT 4260

GGTGTTGTTA TTGATTTAGA CAAAGAAGGT AAAACAAAGC ACATCGAACG TATATTGATA 4320

AATGATGACC ATCCATTTTC AACATTTTAA AATTACGTAA GTAAACATTC GAATTGGACC 4380

CTATCGTCCA TTAGTATGAA TTTAATATAG TACCACTGTT TACATAGTAA ATCGGTGGTT 4440

CTTTTTGTTA TCATTTAATA TGAAATATAT CCATAGGAGG CATATAACTA TGAAACCACA 4500

ATTATCGTGG AAAGTTGGCG GTCAACAAGG CGAAGGTATT GAATCAACTG GGGAAATCTT 4560

CGCTACGGCT ATGAATAGAA AAGGATATTA TTTATATGGA TATAGACATT TTTCAAGTCG 4620

TATCAAAGGT GGACATACGA ATAATAAAAT TAGAGTTTCT ACGACGCCTG TTCATGCAAT 4680

TAGTGATGAT TTAGATATTT TGATTGCATT TGACCAAGAA ACAATTGATG TTAACCATCA 4740

TGAAATGAGA GAAGACAGTA TTATTTTArC TGATGCCAAG GCTAAACCTG TGAAaCCAGA 4800

AGGATGTCAT GCACAGCTTA TTGAATTACC TTTTACAGCA ACCGCTAAAG AATTAGGTAC 4860

AGCATTAATG AAAAACATGG TTGCAATAGG TGCTACTAGC GCATTGATGA ATTTGAATAC 4920

AAATACATTT GAAGAACTTA TTACTAATAT GTTTTCTAAA AAAGGTGACA AGGTAGTTGA 4980

AGTCAATATC CAAGCATTAA ACGAAGGTTA TCAATTAATG CAATCTCGCT TACCTGAAA? 5040

CTACGGGGAC TTTGAATTAG AGTCAACAGA TGCACTACCA CATCTATATA TGATTGGTAA 5100

CGATGCCATT GGATTAGGTG CAATTGCTGC AGGTTCACAA TTTATGGCGG CATATCCTAT 5160

TACACCTGCG TCTGAAGTTA TGGAATATAT GATTGCCAAT ATATCTAAAG TAAACGGAGC 5220

GGTTATTCAA ACAGAAGATG AAATTGCTGC TGTAACTATG GCTATTGGTG CAAATTATGG 5280
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TGGATTATCT GGTATGACTG AAACGCCATT AGTCATTATT AATACCCAAC GAGGTGGACC 5400 

TTCTACTGGA TTACCTACGA AACAAGAACA GTCAGATTTA ATGCAAATGA TTTATGGTAC 5460 

ACATGGTGAT ATTCCAAAAA TTGTTGTAGC ACCAACAGAT GCAGAAGATG CATTTTATTT 5520 

AACTATGGAA GCATTTAATT TAGCAGAACA ATATCAATGC CCTGTTATAG TTCTAAGTGA 5580 

TTTGCAATTA TCTTTAGGTA AACAAACTGT TGAAAAATTA GATTATAATC GTATTGAAAT 5640

TAAACGTGGT GAAATCATTC AATCTGATAT TGAACGTGAA GAAGATGATA AAGGTTATTT 5700 

CAAGCGTTAT GCGTtAACAT CCGATGGTGT TTCTCCTAGA CCTATCCCCG GTGTTAAAGG 5760 

AGGTATTCAT CATATAACTG GTGTGGAaCa CAATGAAGAA GGTAAACCTA GTGAATCTGC 5820

GTCAAATAGA CAACAACAAA TGGAAAAACG AATGCGTAAA ATTGAGCAGT TACTAATTGA 5880

ATCGCCAGTA GAAGCTAACT TACAACATGA GGATGCAGAT ATTCTTTATA TCGGTTTTAT 5940 

TTCTACAAAA GGTGCAATTC AAGAAGGTAG TAACCGTTTG AATCAACAAG GCATAAAAGT 6000 

TAACACTATA CAAATTAGAC AATTGCATCC ATTCCCAACA AGCGTTATTC AAGATGCAGT 6060 

TAATAAAGCG AAGAAAGTCG TTGTAGTGGA GCACAATTAT CAAGGACAAT TGGCTAGTAT 6120 

TATAAAAATG AATGTCAATA TTCATGATAA GATTGAAAAT TATACAAAGT ATGATGGGAC 6180 

ACCTTTCCTA CCACATGAAA TCGAAGAAAA AGGCAAAATA ATTGCTACTG AAATAAAGGA 6240 

GATGGTATAG ATGGCGACAT TTAAAGATTT TAGAAATAAT GTTAAGCCTA ACTGGTGCCC 6300

CGGATGTGGC GATTTCTCAG TACAAGCTGC AATTCAAAAA GCAGCCGCAA ATATAGGGTT 6360 

AGAACCTGAA GAAGTAGCTA TCATCACCGG TATAGGATGT TCTGGCCGTC TTTCAGGATA 6420 

TATTAATTCT TATGGCGTTC ATTCTATTCA CGGACGTGCA TTACCTTTAG CTCAAGGTGT 6480 

AAAAATGGCG AATAAAGATT TAACTGTTAT TGCATCGGGA GGAGATGGTG ATGGTTATGC 6540 

TATaSgTATG GGGCATACAA TCCATGCTTT AAGAAGAAAT ATGAACATGA CGTATATAGT 6600 

CATGGATAAT CAAATTTATG GTTTGACAAA GGGACAAACA TCGCCGTCAT CAGCAGTAGG 6660 

ATTTGTTACT AAAACAACGC CAAAAGGTAA TATAGAAAAA AATGTTGCGC CTTTAGAATT 6720 

AGTATTATCA TCTGGTGCCA CATTTGTAGC CCAAGGTTTT TCAAGCGATA TTAAAGGATT 6780

AACAAAACTA ATTGAAGATG CAATTAATCA TGATGGATTT TCATTCGTTA ATGTCTTTTC 6840 

ACCATGTGTG ACTTATAATA AAATTAACAC ATACGATTGG TTTaAAGAAC ATTTAACAAG 6900 

TGTTGATGAc ATTGAAAATT ATGATTCTAC AGATAAACAA TTAGCGACTA AAACTGTTAT 6960 

TGAACATGAA TCTTTAGTAA CTGGTATTGT TTATCaAGAT AAAGAAACAC CATCATATGA 7020 

ATCtCAAATT AAAGAGTTAG ATGATmCACC ACTTGCTAAA AGAGATATCa AAATTaCTGA 7080 

TGTATTTATA ACAGATCCAT TTATGCTACT CAGTTTTTTA CTATTACAAA AAATAAAGGA 7200 

GTTTTTAAAA ATGAAAGACA CATTAATGAG TATACAAATA ATTCCTAAAA CACCAAACAA 7260 

TGACAATGTT ATACCTTACG TAGACGAGGC GATTAAAATA ATTGACGAAT CTGGTTTGCA 7320 

TTTTAGAGTA GGTCCGTTAG AAACGACAGT ACAAGGAAAT ATGAATGAAT GTTTAATTTT 7380

AATACAATCA TTAAATGAAC GAATGGTGGA ACTTGAATGT CCAAGTATTA TTAGCCAAGT 7440 

TAAGTTTTAT CATGTGCCAG ATGGCATCAC TATTGAAACT TTAACTGAAA AATATGATGA 7500 

ATAACATTAA AAGTGAAGTA AACTGGATTT GAATTGGCTT GTTAGAGATG ACGTATAACT 7560 

TTAACTGTTT TTGCACTTTA TAGTTAAATT TAATATAATT ATTAAATGAT ACGGGCAAAT 7620

AGAAAGGATT TTGTAAAGTG AACGAAGAAC AAAGAAAAGC AAGTTCTGTA GATGTTTTAG 7680 

CTGAGAGAGA TAAGAAAGCA GAAAAAGATT ATAGTAAATA TTTTGAACAT GTTTATCAGC 7740 

CGCCTAATTT AAAAGCAAGC GCAAAAAAAG AGGTnAAA 7778 
(2) INFORMATION FOR SEQ ID NO: 49:
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1128 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 

AGATGAAGTT GTTACgAAAA TTGCGTACGC TGTTTCAGAA CATGTCAAAA TAGAAACAGG 60 

TAATCCATTC TTTCAAACAT CACATAGTGG TTGTGCGACG GGCGGATCCT GTAATTGTTC 120 

ATTATAAAAA ACATCGAGTC AGAAAAAGGT GGTTATTGAA cCACTAACTA GCATCTGACT 180 

CGATGTTTTT ATTTATTCGG GATTGTTTGT TTGAATTGTT GTGCTAAATC TGGTCGATCT 240 

GTCACAATCG TGTGTGCACC TTTTTGGTAT AAATCATTCA TCAGATTTAT ACTATTTACG 300 

CCATAATAGC CTGGAATGAT ATTCATATCA TTTAACCATT TGATAAAACG AGATGAAGTC 360 

AAATCAATGC CTTTAAAATG AGTAGGCATT TGGAACGTTT GTGCTAATGG TTGGTAGTAC 420

CTACCACCTA ATAAATGATA TTTTAAAAAT GCTTCTGTAA CTTCCTGTTG GCTAGCACCA 480

ATTGCGACGG ATCCTTGTGC AATTTTATTA AAACGAACGA TTTGTTCTTT ATAAAAACTT 540 

GTCACAAGAA CGCGGTCAAA TGCTTGATTT TCTGCAATTG TATCAAACAT AATTTGTGGT 600 

GCGATTGAGC CTTCATAGGA TTCAGGAGCA TCTTTTAAGT CTACGTTTAT ATACATATCA 660 

GGATATTGCT TCAGCAACTc ATCGAAGGTT AGTATAGCTG TGTGTGCATG ACCACGATAT 720 
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AATGTATGGG CACTAACTTT TCCAGAGCCG TTCGTCGTTC TATCAACAGT TGCGTCATGA 
AAAACGATAA GCTGTTGATC TTTTGTGAGT CTCACATCTG TTTCAAAGCC ATCAACGCCT 
AATTGTTTAG CATAGTCAAA TGCAAGTTGC GTTTGCTCTG GTCTTAAAGC CATACCACCG 
CGATGCGCAA ATATATATGG TGCATTGCCT TTGAAAAAAG CAGGGATGGT TTGCTTTTTA 
G7AATCACTT TATTTTTATT GATCATTAAT AGACTACTTA AAAATCCAGC ACCGACTAGT 
ACCGCATTTA AAATGTTTCT GTTTACnTTT TTCATAAAAA ATTCCTCC 
(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: $252 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double

(D) TOPOLOGY: linear 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 
CAAGCAAACA ATCGTCGATA AAATTGCTAA AATAATAAAA GTAATTCGAA CTTTCATCAT 
3ATCATCCTT TGTTTATAGA GTCAATATAA GTATGGAATA TGTTAGGTAT ATAGTCAAAT 
3CGTCAACTA ATGGGAATTT TGGCATAGAT AGAGAATTTA AGGCAATTAA AAAGGCATCA 
PlACAGTAATA TGCTGCTTGA TGCCCAAATG ATGACTTTAG CTAAATTGAT TAGTCACTTT 
TAAAGATAAA GAATTGTCAT GAATTAAAAC TCATGTAATG ATGTGTTACA TTTCGCAATG 
ATGGCTTTCA GTTATTTATC GATAACATCA CTCTTGATAC CTTTAGATTT TAAGAAATCT 
rTAATTTTAT CTTGTTGCTT TTTATTAACA TCACCGGCAT ATTTTGTTGG CACGTCGACA 
ACATTGATTT TATTTTGCGG TTGATAGCTA AGCTTTTCAA TATCTTCATC AACATTGGCG 
MTGTACTAT TTAAAGCTTT GAAGTAATTC ATCATTAATT CAACGGGTTT CTTATATTCT 
ITAGGAATAT TGTTTTCAGT GACAAATTTC TTGAAATGCA AATCGTTTTT AACAGCTAAG 
ITAGATAAGT GGCTAAGTGT TTCTGCTTGT TTTTCAGTCA CTTTTGTTTG ACTGTCAATT 
rGTTTATCTA GTTTATGTTG CATAATATAT TTGTTATCAA GTATATCGCT ATTTACAGAC 
AAATACTTTT CTATAGCTTG CTTCATCTCT GCATCACTAA TATCACTATT TTTCTTATCT 
SAGTTAAAGA TATCTTTTGT tTCTAATTTT TTAGCGCTTT TAGGTGCATG GATGCCAGTA 
CTTGTATGAT GATCTTCGTT ATCAGATTGA TCGGACGCGC AACCTGTAAG AATTAATGTC 
3ATGCTAAAA ATGTACTTAG TAGTAATCTC TTTTTCATAA TGTAATATAA CTCCTTAGTT 
TATCTTTAAT TGAAAAAATA TGTATTCATG TTTAATAGAG TAACATTGAA TTAGTTTGGA 
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300 
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840 
900 
960 
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TCTATCAATA ATGCATCATT TTGGACGTTG TTAAGGATAG CTTTATCTAT AAATAACTGC 1140 

ATAATTGGTT GTACTAATTT AGACGTAGGT ATCGTACGTA AAAGCATAAT AATTTCGTTC 1200 

ACATACTTTT CTTTCTCAAT ATCATTTTTC ATATTGATTT GTTTGCGAGA GGTACATACT 1260 

TTAAGCATTA TCGCACATCT CGTTGTATAT ATTAAGTTTA TCATAACATG ATTTTATGTC 1320 

GGGATAAAAA AATAACAGCA TCTTAACAAA TGTAAGATAC TGTCAGTGAA ATGAATGAAA 1380

CTTTAGTTTC TGaTAATATA GTCAAAGGCA TTTAATGCTG CATTTGCACC AGCGCCCATT 1440 

GAAATGATAA TTTGTTTGTT CTTCTGATCT GTGACATCGC CAGCAGCAAA TATTCCAGGA 1500 

ACATTCGTAT TATTGTTACG ATCAATCACA ATTTCACCAC GTTCGTTTAA TTCAACAGCA 1560

TCGTTTAACC ATGATGTGTT TGGAAGTAAA CCAATTTGAA CAAAGATACC ATCTAAGTTA 1620 

AGTAGATGTT CTTCGCCGGT GTTCATGTCT TCGTAACGTA TACCTGTAAC ATGGTCTTCT 1680 

CCGACAACTT CAGTAGTTTT GGCATTTGTT TTGATATCAA CATTTGATAA AGAACGTAAA 1740 

CGATCTTGTA ACACGTTGTC TGCTTTTAAT TCGCTAGCGA ATTCGAATAA TGTAACATGA 1800 

TTAACGATAC CAGCAAGGTC AATTGCTGCT TCAACCCCAG AGTTACCGCC ACCGATAACT 1860 

GCTACGTCTT TATTTTCAAA TAGAGGTCCG TCACAGTGAG GGCAGAATGC AACACCTTTA 1920 

TTAATCAATT GCTCTTCACC TGGAATGTTT AGCTTACGCC AACCTGCACC AGTAGCAATA 1980 

ATGACTGTTT TACTTTCTAA GACAGCACCG TTTTCTAACG TAACTTTAAT TGCTTCGTCA 2040

GTCTTTTCGA TATCTGTAGC ACGTATACCT GTCATTGCAT CAATGTCATA TTGATCAATG 2100 

TGCGCTGCTA AGTTAGAAGA AAATTCAGAA CCAGTTGTTT CTTTAACAGT AATGAAGTTC 2160 

TCAATACCAG CAGTATCATT AACTTGGCCA CCGATACGAT CAGCAACTAT ACCAGTACGT 2220

AAACCTTTAC GTGCTGTGTA AATCGCTGCA CTACCACTAG CAGGACCACC ACCAACGATT 2280 

AAGACATCAT AAGGTTCTTT ATTTTCAAAC TCAGATGCAT CTGCCGTACT GCCTAGTTTC 2340 

GAAAGAATAT CTTGGATTGT CATACGACCA TTGCCAAATT CTTCGCCATT TAAAAAGACA 2400 

GCAGGGACTG CCATGATGTT TTCAGATTCT TCACGGAACA CTGCACCATC AATCATAGAA 2460 

TGCGTGATGT TAGGGTTGAT CACACTCATT AAGTTAAGTG CTTGAACGAC ATCAGGACAT 2520

TTTTGACACG TTAAACTAAT GAATGTTTCA AAATGGAATG AACCTTCTAA TTTTTTAATT 2580

TGGTCAATGA TTGACTGTTT TTCTTTAGGT GCACGACCAC TAACCTGTAA AATTGCTAAA 2640

ACAAGTGAGT TAAACTCGTG ACCTAATGGA ATACCTGCAA ATGTTACACC TGTTTCTTCG 2700 

CCAGGACGAT TGACTGAGAA ACTTGGTGTA CGTTTTAAAG ATTTTTCAGA AAGAGATAGT 2760 

CTAGGTGACA TATCAGTAAT TTCTGTCAAC AAATCTTTAA GTTCTTTGGA TTTATCATCT 2820 
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CAAAGCAAGA TTAGATATGT TAGAACAATT GCAAAAATCA GGCTTAATAC AGCGATAgCA 
AGATACCAAA ATAACCCGCC CCCCTCTAGC TTAAAATGAT AAGTATAGCT AGAGGGGGCG 
GGTATTTCTT GCAATGAATT AGTGTGAAGT TAATGCAGCA TTATCATTTG AATCGAAAGT 
ATCTTTATCC CAATGTTTAG TTAACTTGGC GGTACCTGTA CCAGCTAGCA TTGAATCGTT 
CACGTTTAAT GCTGTTCTAC CCATGTCAAT CAATGGTTCA ACGGAGATGA GCACGCCGGc 
TAAAGCGACT GGCAAGTTTA ACGTTGACAA CACCAATATG GATGCAAATG TAGCCCCGCC 
ACCGACGCCA GCAACGCCGA ATGAACTAAT AATCACGACA GCGATTAACG TTACAATAAA
TTGTAAATCA ATTTCTACAT TAGCGACGGG TGCGACCATA ATTGCAAGCA TGGCAGGGTA 
AATGCCTGCA CAACCATTTT GTCCAATCGA CAATCCAAAT GTCGCAGCGA AATTGGCAAT 
ACCTTCTGGC ACGCCTAGAC GTCTTGTTTG TGTTTGTACA TTCAATGGTA AGGCACCCGC 
GCTTGAGCGT GATGTGAATG CAAAGATTAA TACTTCCAAA GTCTTTTTAA CATAGCGAAT 
TGGGCTAATA CCTAACAGGC TTAAAATAAT TAAGTGAATG ATATACATCG TAATTAATGC 
AGCGTACGAT GCGATTAAGA ATTTTCCTAA AGTCCAAATG GCGCCAAAGT CACTTGTCGA 
TAATGTGTTG GCCATAATTG CTAATACACC GTATGGCGTT AAACGTAAGA CGAACGTCAC 
AATCGCCATT ACTAGTGAAT AGATAGCGTC AATCGCACGC TTAAGCAATT CACCATGATC 
AGGTTGTTTG CGTnTACGCG TAAATAAGCA AATCCTATAA ACGAAGCAAA TATCACGACA 
GCAATCGTGG aAGTTGCACG TTGTCCaGTG AAATCTAAGA ATGGATTTTT AGGCAATAAT 
TCCAAAATTT GTTGTGGTAA CGTATGTGCT GTTAAATCTT TCGCTTGTTT AGCAATTTCG 
CTTCCACGTG CTTGTTCAGC GTTACCAAGG TTAATTGTTG ATGCATCTAA ACCAAACACC 
AAGGCATACA CAACACCAAC AATCGCAGCA ATGGTGACAG TGCCAATTAA AAAGATAAAA 
ATGASACTAC CAATTTTAGC AAACTTTTCT CCGATTTGAA TTTTAGTGAA TGCAGCTACA 

ATAGAAATGA AAATTAAAGG CATAACAATC ATTTGCAACA ATGCAACGTA ACCTTGTCCG 
ACAATGTTGA ACCAGTCACT TGTTGATGTA ATAACATTCG AATGTGTGCC ATAAATAAGA 
TGCAATAACA CACCGAATAC TATACCAATC CCTAAAGCTG TAAACACACG TTTCGCAAAA 
GATATATGTT TGCGAGCCAT CATGTGCAAT ATTACGATGA AAATCACCAA TACAATAATA 
TTAATCAGTG TAAGAAAAGC ATTCATGAAC GTCACTCCTT AAATTTTTGA ATATAATTCC 
GACTAGTATG CT 

(2) INFORMATION FOR SEQ ID NO: 51:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6730 base pairs 
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{C) STRANDEDNESS : double 
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:

ATCAAATCnC AAAATATTTA TTAATnAnAA GGGGATTATC CaTGTgAGAA ACAAAGTAAT 60 

GCTCTTTTTT TACCTCTTGT GGGTTGAAAA aTGGATCATC AGAGATAGAC TTCTTCTTTT 120 

TCGAAGATGA CATTTGATAC TTTAATCTTC TAAAACCATA ACTTGTCGCA TCAAAAATGC 180 

CTTCTTGTAC AAGTAAAATC AAAAATATGC TAATAAAAAT AATTAATGAA ACATAAAACA 240 

ATATATTTAA ATATGTAATG ATAGTATGGC TATTAAAAAG CCATATAATA AACGTTAATA 300

TTGGCGTTAT TAGTGCCATT CCAAGCCATT TTTTCAACAT TTGATCACTC CCACTTATAG 360 

AAAACTCTTA CGCATAGTTT ACATTAAAAT CAGACATTGA GGAATGATTT TTTAATTTCT 420 

TCAGCTTTAT TGAAATTCTA AAATCAATCA TTCTTCATTA GTTTAAAGCA AAAAAATATT 480 

GATATATAGT AAATATTGTA TATATAATAT TAGTTAAGAT TTCaGAAAAT TTTGAAGGGA 540 

ATGGAAATTT AGAAATCGGA ATTTGTTAGA GGAGGGGATT AGATGGGGAA ATATATTTTC 600 

AAACGATTTA TTTATATGCT TATTTCTTTA TTTATTATTA TTACAATTAC ATTTTTCTTA 660 

ATGAAATTAA TGCCAGGTTC GCCATTTAAC GATGCTAAAT TAAATGCTGA ACAAAAAGAA 720 

ATTTTAAATG AAAAATATGG ATTAAATGAT CCTGtAGCTA CGCAgTATTT ACATTATTTA 780

AAAAATGTTG TTACAGGCGA TTTTGGTAAT TCATTCCAGT ATCATAATCA ACCTGTGTGG 840 

GATTTGATTA AACCGAGACT ACTACCTTCT TTTGAAATGG GTCTTACAGC AATGTTCaTC 900 

GGTGTGATAC TGGGACTTAT TTTAGGTGTT GCAGCAGCTA CTAAACAAAA TTCTTGGGTT 960 

GACTATACAA CTACAGTTAT TTCAGTTATT GCAGTATCTG TACCATCTTT TGTACTTGCT 1020 

GTACfTTTAC AATATGTATT TGCAGTTAAA TTAAGATGGT TCCCAGTAGC TGGATGGGAA 1080 

GGTTTTTCGA CCGCGGTATT ACCGTCACTT GCATTATCTG CAGCTGTTTT AGCAACTGTC 1140 

GCCAGATACA TAAGAGCAGA GATGATAGAG GTATTAAGTT CAGACTATAT TTTATTAGCG 1200 

AGAGCTAAAG GTAATTCGAC AATGCGTGTA CTTTTTGGAC ATGCACTTAG AAATGCTTTA 1260

ATTCCAATTA TTACAATTAT CGTTCCCATG TTAGCAAGTA TTTTAACAGG CACTTTAACA 1320 

ATTGAAAATA TTTTTGGAGT TCCTGGATTA GGGGATCAAT TCGTACGTTC AATTACAACA 1380 

AATGATTTCT CAGTAATCAT GGCAATCACA CTATTATTTA GCACACTGTT TATCGTTTCT 1440 

ATTTTTATTG TAGATATTTT GTACGGTGTG ATAGATCCAC GAATTCGTGT TCcAAGgAGG 1500 

TAAAAAATAA TGGCTGAAAA TAAAAACAAT TTGTCGATTA ACGACGATCA TTCTAATGCA 1560 
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TGAATCAGGA 
AGTTAAAACG 
TTGCTTTTAT 
GAAATCTTCC 
AAGATGCAGA 
GTACTGATCA 
TGTTTATCGG 
TTTCTGGATT 
CATCTATTCC 
GGACAATTAT 
GAGAATTTTT 
CAAAATTCAA 
CATCAATGTT 
TAGGTGTACC 
TATTAATTTA 
TCTTTTACTT 
AAGGGGGCAT 
GATATTACAG 
GAAACATTGG 
ACAAAATTAT 
GAAGATTTAG 
ATGATCTTTC 
ATGGAACCAT 
GAAATACTAA 
CAATTTTCAG 
AAAGTGCTCA 
TTAGATTTAA 
GATTTAGGGG 
GTTGAAACAG 



ACCTGAAATG 
AAATAAGTTA 
CGGTCCAGTT 
GGCAAAAATA 
TGGCAAGGAT 
GTTGGGTCGA 
TGTTGTTGCA 
CTTCGGTGGA 
GAATTTAATT 
ATTGGCTATG 
AAAATTAAAA 
ATTGATATTT 
TACAGTACCT 
CGCACCTCAA 
TCCACATGAA 
ATTTAGTGAT 
AGCATATGAC 
CAGGGGAAGT 
CAATTGTTGG 
TCCAAGGGGA 
CAAAAAAACC 
AAGATCCAAT 
TAATTAAGCA 
ATCTTGTAGG 
GTGGACAAAG 
TTGCTGATGA 
TGAAAGAACT 
TTGTTGCGAA 
GAGATGTTAA 



CAACGAGAAA 
GCTGTTGTCG 
ATAAATAAAC 
CCTGTATTAG 
GCTTATAAAG 
GATTTATGGA 
GCGATGTTAG 
CGTGTCGATA 
GTCGTAATTT 
TCTATCACAG 
AATCAAGAGT 
AAGCATATTT 
AGTGCTATTT 
ACATCGTTAG 
TTATTTATAC 
GGATTACGTG 
TGAAAGAATA 
GCAGGCAGTG 
TGAATCAGGT 
CACAGGAAGA 
TGAAAATGAG 
GACATCTTTA 
CAAAAATTAT 
TTTACCAAAT 
GCAAAGAATT 
ACCAACGACT 
ACAAGAAAAA 
TATTGCTGAT 
CGAAATATTT 



GCAAAAACTT 
GTATGATAGG 
ATGATTATGC 
ACAAAGTTCC 
CAGCAAATGC 
CAAGAACATG 
ATATTTTTAT 
CGATTATGCA 
TATTTGTATT 
GCTGGTTAGG 
TTGTCATGGC 
TACCTAATAC 
TCTTCGAAGC 
GGTCATTAGT 
CAGCAATGAT 
ATGCATTTGA 
TTAGAAGTAA 
AGAGGCGTAG 
TCAGGTAAAT 
ATTAAAAAGG 
TTGATTAAAT 
AACCCAACGA 
AGTAAAGCAC 
GCAGAAAAAA 
GTTATTGCAA 
GCATTAGACG 
ATCGATACAG 
AGAGTGGCAG 
TATGATCCAA 



TTGGCAAGAT 
TTTAATTATC 
TGAACAAAAT 
ATTTTTACCT 
TAAAGAAAAT 
GAAAGGTGCT 
TGGTGTTGTA 
ACGTATACTT 
AATTTTTGAA 
CATGAGCAGA 
TTCGAAAACA 
ATTAGGTGCT 
ATTTTTAAGT 
AAATGATGGG 
TTTAAGTTTA 
TCCGAAAATG 
ATGATTTGCA 
ATTTTTATTT 
CTGTAACAAC 
GAGAAATTTT 
TACGTGGCAA 
TGCAAATTGG 
AAGCTAAAAA 
GATTTAAAGC 
CCGCATTAGC 
TAACGATGCA 
CAATTATTTT 
TTATGTATGG 
AGCATCCATA 



GCTTGGGCTC 
ATTGTAATAT 
GTAGAACATA 
TTTGATGGTA 
TATTGGTTTG 
CAAATTTCAT 
TATGGTGCGA 
GAAGTCATAG 
CCATCCATTT 
GTTGTACGTG 
TTGGGGGCTT 
ATCGTGGTTA 
TTCATTGGTA 
CGCGCAATGT 
TTAATTCTAT 
CGTAAATAAA 
TGTTTCCTTT 
GAACAAAGGG 
AAAAGCAATT 
ATTTTTAGGG 
AGATATTTCA 
TAAACAAGTC 
GCGCGCATTG 
ATATCCTCAT 
TTGTGAACCT 
GGCACAAATT 
TATAACGCAT 
TGGTCAAATG 
TACATGGGGA 
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GGAGCGCCAC CTGATTTATT ACACCCACCT AAAGGTGATG CATTTGCGAG ACGTAGcAAT 34 80 

ATGCATTAGA TATTGATTTT AAAGTAGAAC CACCGTGGTT TAAAGTTTCA CCGACACATT 3540 

TTGTGAAATC TTGGTTATTA GACGCACGTG CACCAAAAGT TGAACTACCC GAGCTGGTAA 3600 

AACAACGTAT GAAACCGATG CCTAATAATT ATGAAAAACC ACTCAAGGTA GAAAGGGTGT 3660 

CGTTCAATGA AAAATGATGA AGTGCTATTA TCTATTAAAA ATTTAAAGCA ATATTTTAAC 3720

GCAGGAAAGA AAAACGAAGT GgaGCGATTG AAAATATTTC GTTTGATATA TACAAAGGGG 3780 

AAACATTAGG TTTAGTAGGA GAATCGGGGT GTGGTAAATC TACAACTGGT AAATCAATTA 3840 

TTAAACTTAA TGATATTACA AGTGGAGAAA TTTTGTATGA GGGTATTGAT ATACAAAAGA 3900

TTCGTAAACG TAAAGATTTG CTTAAATTTA ATAAAAAGAT ACAGATGATT TTTCAAGACC 3960 

CATATGCGTC TTTAAATCCT AGGTTAAAAG TAATGGATAT AGTAGCTGAA GGTATTGATA 4020 

TCCATCATTT AGCAACTGaT AAGCGTGACC GAAAAAAACG TGTCTATGaT TTACTTGaAA 4080 

CTGTTGGATT AAGTAAAGAA CATGCCAATC GCTATCCTCA TGAATTTTCA GGTGGaCAAC 4140 

GCCAACGTAT TGGaATTGCC CGTGcATTAG CCGTTGaACC AGAATTCATT ATCGCGGACG 4200 

AACCAATATC GGCATTGGAT GTTTCAATCC AAGCTCAAGT AGTTAATTTA TTATTAAAAT 4260 

TACAACGTGA AAGAGGGATT ACGTTCCTAT TTATAGCTCA TGATCTATCA ATGGTGAAGT 4320 

ATATTTCAGA TCGTATTGCA GTCATGCATT TTGGGAAAAT AGTTGAAATT GGACCGGCAG 4380

AAGAAATTTA TCAAAATCCA TTACACGATT ATACTAAGTC TTTATTATCA GCCATTCCAC 4440 

AACCTGATCC TGAATCAGAA CGCAGTCGCA AACGATTTAG TTATATTGAT GATGAAGCAA 4500 

ATAATCATTT AAGACAATTA CATGAAATTA GACCGAATCA CTTTGTCTTT AGTACTGAAG 4560

AAGAAGCGGC ACAACTACGA GAAAATAAAT TGGTGACACA AAATTAAGGG GAAGGGGGAA 4620 

ATGdAATGAC GAGAAAATTT AGAACACTTA TTTTAATTTT GATTGCTACA ATTGCAT7AA 4680 

GTGGTTGTGC TAATGACGAT GGTATTTATT CAGATAAAGG TCAAGTATTC AGAAAAATTT 4740 

TGTCATCAGA CTTAACATCC CTTGATACAT CATTAATAAC GGATGAAATA TCTTCTGAAG 4800 

TGAcTGCGCA AACATTCGAA GGTTTATACA CATTAGGAAA AGGTGACAAA CCGGTGTTAG 4860 

GTGTTGCGAA AGCTTTTCCT GAAAAGAGTA AAGATGGTAA AACTTTAAAG GTTAAATTAA 4920

GAAGCGATGC TAAATGGAGC AATGGTGACA AAGTGACTGC ACAAGACTTT GTTTATGCTT 4980

GGAGAAAAAC AGTTGACCCT AAAACAGGTT CTGAATTTGC ATACATTATG GGGGACATTA 5040

AAAATGCGAG TGATATTAGT ACTGGTAAGA AACCTGTAGA GCAATTAGGT ATCAAAGCAT 5100

TAAATGATGA AACATTACAA ATTGAATTAG AAAAGCCGGT TCCATATATT AATCAATTAT 5160
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ACGGTACGGC AGCTGATAGA GCGGTATACA ATGGTCCaTT TAAAGTTGAT GATTGGAAAC 5280 

AAGAAGATAA AACCTTACTA TCTAAAAATC AGTATTATTG GGATAAAAAG AATGTAAAAT 5340 

XAGATAAAGT GAATTATAAA GTTATTAAAG ACTTACAAGC CGGTGCATCA TTGTATGATA 5400

CTGAATCAGT AGATGACGCA TTTATTACTG CAGATCAAGT AAATAAATAT AAAGACAACA 5460

AAGGATTAAA CTTTGTGTTA ACGACTGGGA CATTTTTTGT AAAAATGAAT GAAAAACAAT 5520 

ATCCTGATTT TAAAAACAAA AATTTAAGAT TGsTATCGCA CAAGCAATAG ATAAAAAAGG 5580 

ATACGTTGAT TCAGTGAAAA ACAATGGCTC AATTCCTTCC GATACACTAA CAGCCAAAGG 5640 

AATTGCGAAA GCGCCTAATG GCAAAGATTA TGCGAGTACC ATGAATTCGC CTTTAAAATA 5700

TAATCCTAAA GAAGCAAGAG CACACTGGGA CAAAGCTAAA AAAGAGTTAG GTAAAAATGA 5760 

AGTGACATTT TCAATGAACA CAGAAGATAC ACCAGATGCA AAAATATCTG CTGAATATAT 5820 

CAAATCGCAA GTTGAGAAAA ATTTACCAGG AGTTACTTTG AAAATTAAGC AATTACCGTT 5880 

TAAACAAAGA GTATCACTAG AACTGAGTAA CAATTTTGAA GCATCACTTA GTGGTTGGTC 5940 

TGCAGATTAC CCTGATCCTA TGGCTTATTT AGAAACAATG ACCACAGGTA GCGCACAAAA 6000 

TAATACAGAC TGGGGTAATA AAGAATATGA TCAATTACTT AAAGTAGCAA GAACCAAATT 6060 

GGCACTTCAA CCGAACGAAC GATATGAAAA CTTGAAAAAA GCAGAAGAAA TGTTCCTAGG 6120 

AGATGCACCG GTAGCACCAA TTTATCAAAA AGGTGTtGCA CATTTaACAA aTCCTCAAGT 6180

AAAAGGATTA ATTtACCATA AATTTGGTCC AAATAACTCA CTTAAACATG TATATATTGA 6240 

TAAATCGATA GATAAAGAAA CAGGTAAGAA GAAAAAATAA TATGCTTTGT AAATTAGGCT 6300 

GGAGACATAT CTCCAGTCTT TTTGTGTTGG ATAAAAaCTT TGGGAATAAA AATTTAAAAT 6360 

AAGTCGTTTT TTAAATTACT GAAATTGATT AAATGCATAA ATAACTGAAT ATTCTAAAAA 6420 

TAA/ETTTGTA ATAATTTTTT CTATGAGTAA ACTAAAAAGA AAAAATTAGA TTGAAAGTAG 6480 

GAGGCATATG TATGGGGAAG CTAATTAAAT ATATTTCAAT ACTTCTTATT GTCGTTTTAG 6540 

TGTTGAGTGC TTGCGGAAAA AGCAGTAATA AAGATGAAGG AGTAAAAGAT GCTACTAAAA 6600 

CGGAAACCTC AAAACA7AAA GGTGGTACCT TAAATGTAGC ATTAACAGCA CCGCCAAGTG 6660 

GTGTTTATTC TTCGTTATTA AATAGTACAC ATGCAGATTC TGTAGTTGAG GGATATTTTA 6720 

ACGAAAGCTT 6730 
(2) INFORMATION FOR SEQ ID NO: 52:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6482 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS : double
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(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 

AATTTTTGTC ATTATTAAAA ACCTCGCTTT TAAAAGATTG AAAAGTAAAT GAGTGAAATT 60 

AAAGATTATG CACATTAAAA TCACGCCACA ATTTAATTGT GAAAAATATC ACAAATATAT 120 

TATAACACTA AATTTCCCAA AATTCAAAAG TGTGTTTTAT TGCAGAAAAC TTATAACAyG 180 

TGCACAAGTT ATAGTGAATT GCAAACGGAT TACTTTAGTC TTTTTAAAAC ATGAAGTATA 240 

ATTTGTATAG Qpj^pj^p^ AAAAATGGGA GGCTATGTTC AATGAGCAAT ATGAATCAAA 300 

CAATTATGGA TGCATTTCAT TTCAGACATG CGACTAAGCA ATTCGATCCA CAAAAGAAAG 360 

TTTCGAAAGA AGATTTTGAA ACAATATTAG AGTCAGGTAG ATTGTCTCCA AGTTCTCTTG 420 

GGTTAGAACC TTGGAAGTTT GTCGTGATTC AAGATCAAGC GTTACGTGAT GAATTAAAAG 480

CGCACAGTTG GGGCGCAGCA AAACAATTAG ATACAGCGAG CCATTTTGTG CTAATTTTTG 540 

CGCGTAAAAA TGTAACGTCA AGATCACCGT ATGTACAACA TATGTTAAGA GATATTAAAA 600 

AATATGAGGC ACAAACGATT CCAGCTGTTG AACAAAAATT CGATGCATTC CAAGCAGATT 660 

TCCATATTTC TGATAATGAT CAAGCCTTGT ATGACTGGTC AAGTAAACAA ACGTATATTG 720 

CATTAGGCAA TATGATGACG ACAGCCGCAT TGTTAGGTAT TGATTCATGT CCGATGGAAG 780 

GTTTTAGTCT GGATACAGTG ACAGACATTT TAGCAAATAA AGGGATCTTA GATACTGAGC 840 

AATTTGGTTT ATCAGTGATG GTCGCATTTG GCTACAGACA ACAAGAGCCA CCGAAAAATA 900

AAACACGCCA AGCTTATGAA GATGTTATTG AATGGGTTGG ACCAAAAGAA TAAATAGAAT 960 

ACCGTATGTC TAAATATATA AAATTAAAAA GTTAGCAATA AAAAAGCCTG CGATTACATA 1020 

AATGAATCGC AGGcTTTTGC GTGAAAAAAT TGTATTAATA AAGTATGGAT GATTATTTTT 1080 

CTGGfiACAAG GTCAGTATTT GAATGAACTG TGATGTCAAA CCCTTCTGGT GCCGTAAATG 1140 

TATGTGTTGA GGCGTCGGGT TGATAAATAT CAACATGTGT TAATCCATAA CTTTGTGAAT 1200 

TGTTTTGTCT TGCTTGATTG GATTGCCAAG TATTAGCAGC AATATGATGG TGATAATGAT 1260 

TCGTTGACAT AAATAGCGCA CGTGGAAAAT CAGACACATG TTGGAATCCT AATTGTTCAA 1320 

TGTAACATTG ATATGCTGCG TCTAAATCAT GTGTTTTTAA ATGTAAGTGT CCAATCATGC 1380 

CTTTTGCTGG CATTCCTTGC CAACCTTCAT CAGTACGATG TGTTAATAAG GTTTGGCTAT 1440 

CAACTTCTAA AGTATCCATT TTAACTTTGC CATTTTGCCA TTCCCATGAA GATGAAGGTC 1500 

TATCGCGATA GACTTCAATA CCATTACCTT CGGGGTCGTT GAAATATAAA GCTTCACTTA 1560

CTAAATGATC ACCAGCGCCG ATGCCCATAT TTTTTTGTGC CACGAAATAT AAGAAGTTAG 1620 
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aAGTCTGACG GcCGTCTTCT AATAAATGTA ACGTTAGAGT ATGGcCACCA GTCCCAACAG 1740 

ATAATACGGT TGTATTATCG TCAGAACTTT TAACGGATAG TCCTAAAATG TTTTTGTAAA 1800 

ATGTTGTCAT TAAGTCTAAG TCTCTTACGT TCAGTACAAT GTTTGTCACT TGTGTTGCTG 1860 

TTTTATCGTG AAATGCCATT ATGCATCGCC TCTTTTTCTA TTTTTCTATA AGTTAGTATA 1920 

AAAAGTATAC CAGAAAAGAA AATGAATTGA TAGCATAAAG TTTGAAATGC AAAATAACTA 1980 

GTCGTTTTGC AATTTTAtAT TGATGCGAAC AAAAAAGCGA TGGTACAGTT GCACCATCGC 2040 

AAAATTTATT TAACCAAGAT ATACATCTTG ATATGAATCT TCTTTTTCTA ACATATGTTT 2100 

GGCAAATGAA CATGAGGCAA TAATTTTCAA ATTATTTTCT CGAGCGTGTT CAACAACTGc 2160

TTTAAGTAGT TTTTTGCCAA CACCTTGACC ACCAAGTTCA TCAGATACGC CTGTATGATC 2220

AATGTTAATT TCATTATTAT CCACAAAACG GTATGTGATT TCAGCTAAAG CATTATTTTC 2280

ATCATCACCA ATATAGAATT TGTTCTCGCC TTGTTTGATT TCAAGGTTAC TCATACATAT 2340

CAACTCCTAT CATGATTGAT TATAGTATTT CCCTATTCTA TTTTAACTTA AACGAAGTCA 2400 

AAGGTGCATG ACAGTCATGT GACGACATTG CCACATCTAT GTAGTCGTTT TTATTAAGCA 2460 

CAGTTTGAAA TGAAGATGAA AACACGTATC TTGACATTAA ATCTATTCAG CTATATAATT 2520 

TATCTCGAAA TCGAAATAAA ATAAAAAAGT TGGTGATCAT ATGGATCGAA CGAAACAATC 2580 

TCTCAATGTT TTTGTCGGAA TGAATAGGGC GTTAGACACA TTAGAGCAAA TTACAAAAGA 2640 

AGACGTAAAG CGATATGGCT TAAATATTAC TGAATTTGCA GTGCTCGAGT TGCTTTATAA 2700 

TAAAGGTCCG CAACCAATTC AACGTATTAG AGACCGCGTA TTAATTGCAA GTAGCAGCAT 2760 

TTCATATGTT GTAAGTCAAT TAGAGGACAA AGGTTGGATT ACACGTGAAA AGGATAAAGA 2820

TGATAAACGT GTATATATGG CTTGTTTAAC TGAAAAAGGT CAAAGTCAAA TGGCAGATAT 2880 

TTTCCCTAAG CATGCTGAGA CATTAACAAA AGCGTTTGAT GTGTTAACAA AGGATGAATT 2940 

AACAATCTTA CAACAAGCGT TTAAGAAACT AAGTGCACAA TCTACAGAAG TGTAAGGCGT 3000

GCACTAAAAA TTTACATTAA AGTATCTCGA TTTCGAGATA AATGCACTAA AAATATAAAG 3060 

AGGGTATATA AAATGATAAA TAATCATGAA TTACTAGGTA TTCACCATGT TACTGCAATG 3120 

ACAGATGATG CAGAACGTAA TTATAAATTT TTTACAGAAG TACTAGGCAT GCGTTTAGTT 3180 

AAAAAGACAG TCAATCAAGA TGATATTTAT ACGTATCATA CTTTTTTTGC AGATGATGTA 3240 

GGTTCGGCAG GTACAGACAT GACGTTCTTT GATTTTCCAA ATATTACAAA AGGGCAGGCA 3300 

GGAACAAATT CCATTACAAG ACCGTCTTTT AGAGTGCCTA ACGATGACGC ATTAACATAT 3360 

TATGAACAGC GCTTTGATGA GTTTGGTGTT AAACACGAAG GTATTCAAGA ATTATTTGGT 3420 

TTAAATGAAG GGGTAGCACC TGGTGTACCT TGGAAGAATG GACCGGTTCC AGTAGATAAA 3540

GCGATTTATG GATTAGGCCC CATTGAAATT AAAGTAAGTT ATTTTGACGA CTTTAAAAAT 3600

ATTTTAGAGA CTGTTTACGG TATGACAACT ATTGCGCATG AAGATAATGT CGCATTACTT 3660

GAAGTTGGCG AAGGAGGCAA TGGTGGCCAG GTAATCTTAA TAAAAGATGA TAAAGGGCCa 3720

GCaGCACGTC AAGGTTATGG tGAGGTACAT CATGTGTCAT TTCGTGTGAA AGATCATGAT 3780

GCAATAGAAG CGTGGGCAAC GAAATATAAA GAGGTAGGTA TTAATAACTC AGGCATCGTT 3840

AATCGTTTCT ATTTTGAAGC ATTATATGCA CGTGTGGGGC ATATTTTAAT AGAAATTTCA 3900

ACAGATGGAC CAGGATTTAT GGAAGATGAA CCTTATGAAA CATTAGGCGA AGGGTTATCC 3960

TTACCACCAT TTTTAGAAAA TAAAAGAGAA TATATTGAAT CGGAAGTTAG ACCTTTTAAT 4020

ACGAAGCGTC AACATGGTTA ATTGGAATGA GGAGGATTTG TGATGGAACA TATTTTTAGA 4080

GAAGGACAAA ATGGTGCGCC AACACTAATA TTATTGCATG GTACAGGTGG TGATGAGTTC 4140

GATTTATTAC CGTTAGGCGA AgcATTGAAT GAAAATTATC ACTTGTTAAG TATTAGAGGA 4200

CAAGTTTCAG AAAATGGGAT GAACCGTTAT TTCAAACGTC TTGGTGAAGG TGTTTATGAT 4260

GAAGAAGATT TGGCATTTCG TGGACAAGAA TTGTTGACGT TCATTAAAGA AGCTGCTGaA 4320

CGTTATGATT TTGaTATTGA AAAAGCAGTA CTTGTTGGAT TTTCAAATGG ATCAAATATA 4380

GCGATTAACT TAATGTTGCG TTCAGAAGCA CCATTTAAAA AAGCATTGTT ATATGCACCG 4440

TTATACCCAG TTGAAGTAAC GTCAACAAAG GATTTATCAG ATGTCAGTGT GTTGCTTTCT 4500

ATGGGGAAAC ATGATCCAAT TGTGCCATTA GCTGCAAGTG AACAAGTCAT TAACTTGTTT 4560

AATACACGTG GGGCACAAGT CGAAGAAGTT TGGGTGAAGG GCCATGAAAT TACAGAAACT 4620

GGATTAACGG CTGGTCAACA AATACTTGGG AAATAACAGT TCTATTAAGA AGCGGACAGA 4680

TGGAAAAGAT TTTTACTTTT CATCTGCCCG LI i 1 1 1 iviAT TTTGAAGTGC TGTACTAAAT 4740

TTTACAATAG TATAGATATT TTAATCGATA TGAGATTTGC CGGTAATACG CTTAATTAAA 4800

CCTTTATAGA GTACAGGTAT GAGTAAGATG AAACCGAACA ATCCCATAAT AGGGAATACT 4860

TTTCCAATTA ATGAAATGAa ACCGATAAAT GTACTAATAT AAGTGATGAC AGCCATTGTA 4920

ATAATAATGA TGAAGTAACG 


TCTGCTGAAT GGAACGCTGA 


AACGTGACGC 


AAATGCATAC 


4980 




ATTAATCCAA CAACAGTATT 


GTAGATGACA AGTATCATAA 


TGACAGACAT 


AATAATACCA 


5040 


SO 


ATTGACGGAG ACATTTGTGT 


CGCTAATTTT AATGTAGGTA 


GATCTACGTG 


TTTAAlTI-i'A 


5100 


TCGAATTGAG AAATTAAACC 


TAGATTAATC ATCATGAGTA 


AAAATGTAAT 


GATTAAACCG 


5160 




CCAATCAAGC CCCCGTATAA 


CGTTGAGTCA CGATATTTAA 


ctttactacc 


CATCACTGAT 


5220 
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CCAGGTGATA ATGATTTCTG CTTATGAATC TGAGCATCAT TATTAGCGGC AGTAAAATCA 5340

AGATGACTTG TTGTGAAATA GTAGACCGCA ATCATAATGA CAATCGCAAT TAAAAATGGG 5400 

GTAACACCGC CAAGCACAGC AATTAAACGA TCGAATTTTA GAAACAGTGT TGCTAAAATA 5460 

AAGGCGACTA ATATGAGTGC GCTCAGCCAA TACGGTAAGT TGAAACTTTG ATGAATGGTT 5520 

GACGCACCAC CTGCAGTCAT AATAATAGCT AAAGACAACA TAAACATTGT TAAAATAATA 5580 

TCAAAACCTC TTGCAATAGA GGGGTATAAG AAATAGTTAA TTGAATCAGA ATGATTTCTG 5640 

GACTTTAGAT GATGACCTGT ATGCATGACA ACCATTCCAC CTAAAGTAAT CAATAGTCCT 5700 

GTTACAATAA TGCCTGAAAT GCTATATGCG CCATGACTTG TGAAAAACTG GAAAATTTCT 5760

TGACCAGTAG CAAAGCCGGC ACCAACGACA ACACCAACAA AGGCAAATGC CACAATAATG 5820 

GACTCTTTTA AGATACGCAT GATTTAAAAA TGTCCCTTCG TAATTTTAAG TAATATAGAA 5880

AATGTAACAT ACATGTTAAT GAAAAATATA GTACTAATAT AGTATTTTGT TAAATTGGAG 5940

TAGAAGCGAG GGTGTCGGTC ATTTCATTAA TTTATTAGTT GATTTTGCAT TTTTTTGCTG 6000 

TAAAGTTGTT ATAATACAGT TAACAGGAAT TAGCATAGAT ACACCAATCC CCTCACTACT 6060

CGCAATAGTG AGGGGATTTT TTTCGGTGTA GCTAGGTCGC CTATTTATCA TCGTGTTTGC 6120 

GTAgCaATGC GTAAACACAG TACCACTAAA TAAGTGCACG ATACATGCAT CAAATGTCGT 6180

CTTTAGTcTA AGTAACGATC ATGCATTAAC ATTTTCAAAA TATCTATTTG AGCTTGAAGA 6240

TCTTTACCAA TATTGGTATC ACGAATCTTC TTACGTTGTA ATTCTTTATC TACGACGCGC 6300 

TTTATAGAAA GTTCATCGAT ACCTTCGGAA AGTATTTTTn CTTTAGCGTT AAATTGTTGG 6360 

TGTGCAACGA GTTGCATACC GAATGAATTA TACAATAGTG TATAGCCTGC AATGCCAGTn 6420 

GTTGACTGAT AAGCTTTTGA AAAGCCACCA TCAATGACAA GCATCTTTCC ATCAGCCTTG 6480 

AT 6482

(2) INFORMATION FOR SEQ ID NO: 53:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 16592 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 
ATTTAAGGCG ATTGCTTGTG TATTTCTCTC TTTTGTAGGC AAACCTGCAC TCGTTCCAAA 60 
AAATGTAACT TCCATATATG CCCCTCCTTT TCTTCAATTC ATTTTATCAT AAAATTTGTA 120 
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AATTTTTCTA ACTTTAACGT AGACATAACT ATATAAATTT TGATAATTAC GTTATACTTA 240
TCATTAATAA GTATCACATT AAACATGATA CATGAATCGA TATTTCATTT AAGACACTGC 300
ATACAGTCGA GCATATTGTA TGACCTACTG AATGGATTAT CTTATAATAA TAAATCATAT 360
ATCTAATTAA GAATTGAGGT TTTAATCTTG AGTACTAAAA ACAAACACAT CCCATGTTTA 420
ATCACAATCT TTGGTGCACT GCGTGACTTA AGCCATCGTA AGTnGTTTCC ATCAATATTC 480
CATCTCTACC AACAAGACAA TTTAGATGAA CATATTGCCA TcATCgGTAT TGGACGTCGT 540
GACAT1CWT1TA ATGATGATTT CCGTAATCAA GTAAAATCAT CAATTCAAAA GCACGTAAAA 600
GATACAAACA AAATTGACGC GTTTATGGAA CATGTCTTCT ATCATAGACA TGATGTTAGT 660
AATGAAGAAA GCTATCAAGA ATTACTAGAT TTTAGTAATG AATTAGATAG CCAATTTGAA 720
TTAAAAGGTA ATCGACTATT CTATTTAGCA ATGGCACCAC AATTCTTTGG CGTTATTTCT 780
GATTATCTAA AATCTTCTGG TCTTACTGAT ACAAAAGGAT TTAAACGCCT TGTTATCGAA 840
AAACCATTCG GTAGTGATTT AAAATCAGCC GAAGCATTAA ACAATCAAAT TCGTAAATCA 900
TTTAAAGAAG AAGAAATTTA TCGTATTGAC CACTATTTAG GAAAAGACAT GGTTCAAAAT 960
ATCGAGGTAT TACGTTTTGC GAATGCGATG TTTGAACCAT TATGGAATAA CAAATATATT 1020
TCAAACATCC AAGTTACATC TTCTGAAATA CTAGGTGTTG AAGATCGTGG TGGTTATTAT 1080
GAATCAAGTG GCGCGCTAAA AGATATGGTG CAAAACCACA TGTTACAAAT GGTTGcATTA 1140
TTAGCTATGG AAGCACCTAT TAGTTTAAAT AGTGAAGATA TCCGTGCTGA GAAAGTAAAA 1200
GTACTTAAAT CACTGCGTCA TTTCCAATCT GAAGATGTTA AAAAGAACTT TGTTCGTGGT 1260
CAATATGGCG AAGGCTATAT CGATGGTAAA CAAGTTAAAG CATACCGTGA TGAAGATCGC 1320
GTTGCAGATG ACTCTAACAC ACCTACCTTT GTTTCAGGTA AATTAACAAT TGATAACTTT 1380
AGATGGGCTG GTGTACCATT CTATATTCGT ACTGGTAAAC GTATGAAATC TAAAACAATT 1440
CAAGTTGTCG TTGAATTTAA AGAAGTACCA ATGAACTTAT ACTATGgAAA CTGaTAAACT 1500
GTTAGATTCA AACCTATTAG TAATCAATAT CCAACCTAAT GAAGGTGgTA TCTTTtACAT 1560
CtAAATGcTA AGaAAAATAC ACAAGGTATC gAAACAGrAC CTGtCCmATT GtCTTACTCm 1620
ATGaGCGcTC aAGaTAAAAT GaATACTGTA GATGCATATG AAAATCTATT ATTTGATTGT 1680
CTTAAAGGTG ATGCCACTAA CTTCACGCAC TGGGAAGAAT TAAaATCAAC ATGGAAATTT 1740
GTTGATGCAA TTCAAGATGA ATGGAATATG GTTGaTCCAG AATTCCCTAA CTATGAATCA 1800
GGTACTAATG GTCCATTAGA AAGTGATTTA CTACTTGCTC GTGATGGTAA CCATTGGTGG 1860
GGACGATATT CAATAATTGA ATTAAAACGC ACATGTTAAA CAAAAATAAA TGAGCGAATG 1920
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TATATTATGA AATTATATTT TACAATGCCC AAAACTATTT TAATAATCAT TGAACAAATG 2040

GGTG7ATAAT TTATAGAAAT AATGTAGAAT AAAAATAAAT GATTGAATTA ATTGGAGTGA 2100 

AAGTTTTGGA CGTTATCAAG CAAATACAAC AGGCAATTGT TTATATTGAA GATCGTTTAT 2160

TAGAGCCTTT CAATTTGCAA GAATTAAGTG ATTACGTTGG TCTTTCGCCA TACCATCTTG 2220 

ATCAATCATT TAAAATGATT GTCGGCTTAT CTCCAGAAGC TTATGCACGC GCGCGTAAAA 2280

TGACACTCGC TGCAAATGAT GTGATTAATG GTGCTACACG ACTTGTAGAT ATCGCTAAAA 2340 

AATATCACTA TGCAAATTCA AATGATTTTG CAAATGATTT TAGTGATTTT CACGGCGTAT 2400 

CACCTATTCA AGCCTCTACT AAAAAAGATG AATTACAAAT TCAAGAGCGA TTATATATCA 2460 

AATTATCAAC TACTGAGAGA GCACCTTATC CATACAGATT AGAAGAGACA GATGATATTT 2520 

CATTGGTTGG ATATGCACGA TTTATAGACA CTAAGTATTT GTCACATCCT TTTAATGTTC 2580 

CGGATTTTTT AGAAGACTTG CTCATTGATG GTAAAATTAA AGAGTTACGA CGATATAATG 2640

ACGTTAGTCC ATTTGAACTA TTTGTTATTA GTTGTCCTCT TGAAAATGGT TTAGAAATAT 2700 

TTGTAGGTGT ACCAAGTGAA CGTTATCCTG CACACTTAGA AAGTCGATTT TTACCTGGCA 2760 

AACATTGTGC GAAATTCAAT TTACAAGGTG AAATTGATTA TGCAACTAAT GAAGCTTGGT 2820

ACTATATTGA ATCAAGTTTG CAGTTAACAT TGCCATATGA ACGAAATGAT TTATATGTTG 2880 

AAGTGTACCC TCTCGATATT TCATTTAATG ACCCATTCAC TAAAATTCAG CTTTGGATTC 2940 

CTGTTAAACA GAGTCCTTAT GACGAAGATT AAATAATAAA AAACAAAGAA GCCCCCTAAT 3000

ATATCTATAG GTCTACAAAT GGCCTTAGAT TCTATTAGGG GGCATATTAA TATGTTAATT 3060 

TAGTTCGATA ACACATGCTT CATATGGACG TAACTGTTTT AAATTAACTT TGGCATCATA 3120 

ATTAAATAGC TTTACTTCTC CATGGCTTAA ATCAAATGGT ACAGTTAATT CTGCTTCGTG 3180 

GTTAGTAAGA TTACCTACAA TAAGAACTTG CTTTTCATTT AATGTTCTCG TGTACGCAAA 3240 

AACTTGTGAA TTTTCAGCAT CTACTAAATC AAATTGACCA TATACGTATA CATCATTAGA 3300

CTTTCTTAAT TGAATTAAAT CTTTATAAAA TTGTAATACT GAATGCTCAT CTTCTAATTG 3360 

TTGTGCAACA TTGATAGTTT TATAATTCGG ATTCACTGGG AACCACGGTT CACCATTTGT 3420 

AAATCCTCCA TTTAACGTAT CATCCCATTG CATTGGTGTG CGAGAATTAT CTCGGTTCTC 3480

ATCTTTATAT TTCGCAAGTA AAGCGTCTAC ATCTCCACCT TGAGCTTTCA CTATTTGATA 3540 

GTCATTTTTA ACAGCAACAT CGTTAAACGT TTCAATACTT TCAAATGGAT AATTCGTCAT 3600 

ACCAATTTCT TGACCTTGAT AAATGAATGG CGTACCTTGT TGCAAGAAAT AAACAGCTGC 3660 

ATGACTTGTT GCTGATTCAT ACCAATACTT GTCATCGTCA CCCCACGTCG ATACACGTCG 3720 
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CCATCTATTT AATACAGATT TATACGAATT TACATCAAAG TGAGAATCAC CACTATTCCA 3840

CAGTCCCAAA TGTTCAAATT GGAATATCAT ATTAAATTTA CCATTTTCTT CCCCGACCCA 3900

GTCATCAGCA TCATCAGGGC TTACACCATT CGCTTCACCA ACAGTCATAA TGTCATACTT 3960

ACTTAATGAG CGATCTTTCA TCTCTTGTAA CCAAGTTTGT ATACCTGGCT GATTCATATC 4020

TACATCAAAT GCTGGGGCAT ATGTTTTACC CTCAGGTACA GGTAAGTCAC CCGCTTCAAA 4080

CGTCTTCTTA ATATGCGTAA TTGCATCTAC TCTAAATCCA TCAATGCCTT TATCAAACCA 4140

CCAGTTCATC ATTTCAAATA CAGCATCTCT AACTTCCGGA TTACCCCAAT TCAAATCAGG 4200

TTGTTTTTTA CTGAATAAAT GGAAATAATA TTGCTCAGTA TTAGCATCAT ATTCCCATGT 4260

AGATCCATTA AATATACTTT CCCAGTTGTT AGGTTCAGAG CCATCTGGCT TTGGATCTTG 4320

CCAAATGTAC CAATCACGTT TGGGATTGTC TTTACTAGAT TTGGATTCTA TAAACCAAGG 4380

ATGTTCATCA QATGTATGAT TTACAACTAA ATCTAAAATA AGCTTCATGC CTCTATCATG 4440

AACACCTTTT AATAAACGAT CAAAGTCTTC CATCGTTCCA AATTCATCCA TAATCTCTTG 4500

GTAGTCACTA ATATCATAAC CATTGTCATC ATTAGGTGAT TTAAACATTG GACTGAGCCA 4560

AATGACATCG ATACCGAAAT CTTTTAAGTA GTCCAATTTA TCAATCATTC CAGGTAAATC 4620

CCCAATACCA TCGTGATTAC TATCATTAAA ACT7CTTGGA TATACTTGAT ATGCTACTGC 4680

TTCTTTCCAC CATTGCTTAT TCATTTTAAA ACTCCTTTGC TATCGCTGTG TTGATTTTCT 4740

TATTTTTAAT TCTGTATCTA TAATGACGAG TTCAATAACA TCCTGTGCTT TGTTTTTCAA 4800

TATATTTAAA ATTGCTGCAC CAGCCTGTTG ACCTAACATT CGAGGCTTGA TGTCAATACA 4860

GGTTTGTGGT GGTGACGCAA TTTCGGTTAA ATAAGAATCA TTGAACGTTG CTGTCATTAC 4920

ATCTTTCGGA ATTTCAATAT TAAGTTCATA TAGGACACTT AAAATCGCTA AATGTAACAT 4980

AGCATCTAAC GAAATGATTG CCTGTTTAAT ATTTGGGTCC TTCAAACGCG TATGTAGATT 5040

TTGCATGTAA TTAAAAATAA CTTCTCTTTC ATTACTAGTC TCAATAATTT GATAATTAAT 5100

TT7ATTTTGA GAAGCTATCG TTTCAAATCC TTGAATTCTA TCTTTTGAAA CTTCAAAATT 5160

TCCTTTTTCT GTAATAAATA TTAATTCATC TACACCTTGT TCAATAACAT GTCGTGTCAA 5220

ATTTTCAGAA GCTAATATAT TATCATTATC TATATGTGTA AATTGATGAT CTATATCCGA 5280

TGTAGGCTTA CCAATCACAA TAAATGGCAT GCTTTCATCA ATTAACATTT GTTTAATCGG 5340

ATCATTTTCT TTTGAATAGA GCAGTATAAA CGCATCAACC ATTCGTTGTT TAATCATTTT 5400

ATAAACTTCA TCCATTAAAT CATTCATATT ATTTGAGACT GTCGTTTGTG TACCATAGCC 5460

ATGCTGGTTA CACGTTTCAG AAATTCCTAG CAATACATTG ATGTAGAATG GATTCAGTCG 5520

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AGTTCTAGCA GCGGTATTAG GAAAATAATT CAATTCTTCC ATAACTTTCT TCACTTTTGA 5640 

AATTGTCGCT TCGCTAATAC GTTGATTTCC TTTTATAACT CTTGAAACTG TCGAAGGAGA 5700 

AACACCGGCT TTTAGTGCAA CATCTTTAAT CGTAACCATT TAATCACCTC CTGTTAATTT 5760 

CTGCATCGGA AAACGCTTCC AACCACTGTA TAATACCAGT TTAGTCACAC TTTCTAAAAA 5820 

AGTCAAAAGA TTTGTGCAAA CGATTGCATA AAACGATAAA AATAAAACCT TCATACTGAA 5880

ATTCAATCCG AAAATCAATA TAAAGGTTTG TATAAATATT AAAATCGATT GTTTAGTCAC 5940 

TAACTGCAAA ATAGTTACCT TGGCCATCTT GAAAATTAAA TACACGTTGA CCATTCATTT 6000

CTACTATATC ATGCCCAGTT AAACCTAAAT CATTTAATTT TGAGTATAAT GCATCAAAGT 6060 

TTTTCTCTTT AAACATTAAA GATGGTGTTC CTAGGTTCAC TTCCGGGCTA TGCTTTTCAA 6120 

TAAATTCTTT TGCCATAATC GTCAATGACG TTTCAGCATC TTTGGTAGGT GATACTTCAA 6180 

CTGCAACATA GTCCTCAGCT AACGGTGTTT CACTTACAAC AACAAATTCT AAAGTTTCTG 6240

TCCAAAATGC TTTCGCTTTT TCGACATCAT CAACATATAA CATAACTTGA TTTAACTTTT 6300 

CCATAAAATA GTACCTCTAT TTCTCTATAG TACATGCTAT CATAACACAG TAAATATTTT 6360 

ATTACTTCAC AAAATGCTTA AAAATATGGC GGGATGCTTT TAAGGTCAAG GATAATACTT 6420 

GTGTAATTTT TTATAGGTTG TAGCTACTCT ATCACACTCT CTTTTATATT TATCAAAAGA 6480 

TATAAAAAAG GATAGTATCT TTCAACTATC CTTTAATCAA TATTATTCTT CAATCCATTG 6540 

TGTATGGAAT ACGCCtTCTT TATCTTTTCT TTCGTACGTA TGAGCACCGA AGTAGTCACG 6600 

TTGTGCTTGA ATTAAGTTTG CAGGTAAATC AGCAGCACGG TAACTATCAT AGTAATTAAT 6660 

ACTTGATGAG AAACCAGGTG TTGGTACACC ATTTTGAACA CCAGTTGCGA CAACATCACG 6720 

TAACGCATCT TGATATTCAG TAACGATGTT TTTAAAGTAA GGATCTAGCA ATAAGTTTTG 6780 

TAATCCTGGA TTATTATCGT AAGCATCTTT GATCTTTTGT AAGAATTGTG CACGGATAAT 6840 

GCAAfcCTTCT CTCCAAATCA TAGCTAAATC ACCAAGTTTT AAATTCCATT CATTATCTTC 6900

ACTTGCTTTA CGCATTTGcG CGAAACCTTG TGCATAAGAA CAAATTTTAC TCATATATAA 6960 

TGCTTTACGA ATTTTTTCTA AAAAGTCTTT CTTGTCACCA TCAAATGATG CTTTTGGACC 7020 

ATTTAATTCT TTAGAAGCAT TTACGCGCTC TTCTTTGaTT GAAGAGATAA AACGTGCAAA 7080 

TACAGATTCA GTAATGATTG TTAATGGAAT ACCTAATTCT AATGCGTTAA TTGAAGTCCA 7140 

TTTTCCTGTA CCTTTTTGaC CTGCAGTATC AAGAATTTTT 7CAACTAATG CTTCTTTATT 7200 

TTCATCTAAT TTCATGAAAA TATCACCAGT GATTTCAATT AAATAACTTT CTAATTCACC 7260 

AGCATTCCAG TCTTTGAACG TTTGAGCAAT GTCTTCATGA GACATGCCTA ATAATTCTTT 7320 
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CATTTTCACA TAGTGTCCAG CACCATTAGG TCCAATATAA GTAACACATG AAGCACCGTC 7440 

TTTTGCCTTT GCAGCAATTG CATCAAGAAT ATCTGCAACT TTGTTATAAG CTTCTTCTTG 7500 

TCCACCCGGC ATTAATGACG GACCAGTTAA CGCTCCAATT TCACCACCAG AAACGCCCAT 7560 

ACCAATAAAG TTGATTGCAC TTTGTGywAA TGCTTTATTA CGTCTGATAG TATCTTGATA 7620 

GTTTGTATTA CCACCATCAA TTAAAATATC TCCATCATCT AATAAAGGTA ACAAACTATC 7680

AATCGTTGCG TCCGTAGCTT TACCTGCTTG AACCATTAAT AAAATTTTAC GTGGTTTTTC 7740 

TAAAGAATTA ACAAATTCTT CCAATGAATA CGTTGGATGA ATATTTTTCC CTTTTGATTC 7800 

TTCAACCATT AAATCAGTTT TTTCACTTGA GCGGTTAAAT ACAGATACAC TATATCCGCG 7860 

TGATTCAATA TTCCAAGCTA GGTTTTTACC CATAACGGC7 AAACCAATAA CTCCAATTTG 7920 

TTGTGTCATA TTACTTACCT CACTTGTTGA TTTTTCATTA GTATTGTATC ACAAAATAGA 7980 

CATACACTAC ACTAAATCAT TTCGAATGTC GCGCAACTAT TTTGATTATT TCTAACACTT 8040 

GACTTGCAAG CAAGTTCAAT GATTTAATCG GCATTCTCTC ATTTGTTGTA TGGATTTTTT 8100 

CATAACCCAC TCCTAAAATG ACTGAAGGAA TACCAAATGT ATTAATAATA CTGCCGTCTG 8160 

AACCGCCACC AGAAATAATT GTATTTGCAG ATAATCCTAA ATTACGAGCA CTTTCTTGTG 8220

CAATTTTAAC AACCGCTTCA TTATCATTAA TTTTAAATCC TGGATAACTT TGCTCCACTG 8280 

TAACTACTGC TTTCCCACCT AATTCTGATG CAGTAGTTTC AAACACATCA GTCATATGTT 8340 

TGACTTGTGT TTTTATTCTT TCTGGATCGT GAGAACGTGC CTCTGCTTCT AAAATGACTT 8400 

CATCTGCAAC AATATTCGTA GCTGAACCGC CATGAAACTT ACCAATATTG GCAGTAGTTA 8460 

TTTCATCAAC TTGTCCTAAT TTCATTCGAC TAATTGcTTT CGCCGCAATA TTAATAGCAC 8520 

TAACACCCTC TTTTGGCGTA CTTGCATGAG CCGTTTTGCC AAAAATTTTA GCTGAAATTA 8580 

ACATTTGCGT CGGTGCACCT ACAACCGTAG TACCGACATC AGCACTTGCA TCAATAGCAT 8640 

AACCAAAGTC CGCGTCCAAC AACTCTGAAT TTAATTCTTT AGCACCAATT AAACCTGATT 8700 

CTTCTCCAAC AGTAATCACA AATTGAATTT GTCCATGTGG GATTTGTTGT TCCTTTATCA 8760 

CTTGCAAAAC TTCAAGCATC GCTGATAATC CTGCTTTATC ATCTGCACCT AGAATAGTCG 8820

TACCATCAGA GTATATGTAG CCGTCATCTT TTACAATTGG CTTTACATTA ATTGCGGGTA 8880

CAACAGTATC CATATGGCTC GTCAAATATA ATTTAGGTAC TTCGCCTTCT TCGATAGTAC 8940 

TATTCATTG? ACACACTAGA TTATTGGCAC CTAATTTAGG ATGTTTAGCC GCTTCATCTT 9000 

CTTTAACATC TAACCCTAAT GCTATGAATT TTTCTTTTAA AATAGGTTGG ATTGTTGATT 9060 

CATTCCCTGT CTCAGAATCG ATTTGTACAA GTTCAAAAAA CGTATTAAGT AATCTTTGCT 9120 
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GATGAAATAA AATGTTACAG TAATTGACGT TACACAGATT TATCAGGTTT GTAAATTGTG 9240

TCATATTATT TTCAATTTAT TATATATAAT TATTGTAACT CAAACTAAGC TTTGTCAAAA 9300

ATATATTGAT TGATTTTTCA AAGATATCGT ATAATGAGGA AAATGACATA AGCAAACTTA 9360

CTCATGTTTT TTATTATATT CCTTTATGAT GATTGCTAGT TATATCGTCT CAAGTTAAAA 9420

GTTTTATATC TTATGTCGTA ATTATTAATA CAAAGGTTAT TCATTTGGAG GCACACAAAA 9480

TGCAAAATAA AGTTTTAAGA ATTATCATTA TCGTTATGCT TGTATCAGTT GTATTAGCAT 9540

TGTTATTAAC GAGTATCATT CCAATTTTAT AAACTATATC TCAACTACCT ATACAAAATC 9600

ATACAATTAA AAATCCATCC ATTATAAACG CATGTATTAA TAAGTTATCG TATTGCAACG 9660

ATTACTTTCA AACATGGGTC ATACGGATGG ATTATTTTTT AAGCTACTTC ACTATGCATT 9720

TTCAATGAAC CAAATTGCGA TTTGATTTGT AAATATTCTT CTAATTCATT TAATATTTGA 9780

ATAATACTTG CTCTCGAGTT AAGCGCTTTG TGTGTTGTTG GCAATGGCAG TTCATCCAAT 9840

TTCAAACGCG TCTCATACAA ATTGTGTAAA CGCATTGCTG TATAGTCATT ACTATTCACA 9900

TTTAGACCAA TTTCTTTCAG CAGTGACGCA ACATCATTTA AAAGCGGATC TTTATGACAG 9960

ATACTTTCGA TGAGCGGTTT CATTCTCATT AACAATTCCA CTTGCTCTTC TCGCATATCA 10020

AAATAATGAT AGTATGAATT TTCGTTTCTA ACAAAATGAT TTTTAACATC TCGGAACGCG 10080

ATAGACTtCG CCTTTTTAAT ATTTAAAAGT AACACTTCAA ATTCAATCGC AATGGTATCT 10140

TCATATTTTT CACAAATATA ACTATATTTA CTAAAAATAT CAGCAATTTG TTGCTCAATT 10200

TTACATTTGT ATTCGTCtAG TTGTTTGTCT AAACTTGGCA TCATTAAATT CaTTGTAAAT 10260

GCAATGCTTA GTCCAATTAA CAGTAATAAT GTTTCATTAA CAATTAAATG TGCATCAATT 10320

GATTTTGCAT TAAAAACATG AAGTAATATA ACGCAACTCG TAATGACACC TTCTTGTACT 10380

TTTAATACGA CAGTTAATGG TA7AAATAAC AATACGATAA TACCGAGTAC AATTGGACTC 10440

TGACCTAATA AACTAAATAT TGCTGAACCT AAAAACAATA CTAAAAAACA TGATACTAAT 10500

CTTGAAATAA TCGCTTGTAG CGAATGTACT TTTGTATGTT TAATACATAA TACGACTAAT 10560

ATGGCGCTTG AAGCATAATT ATCTAAACCT AACAGCTTAC TAATAATTAC ACCTAAAGTC 10620

ATACCCACTG CTGTTTTTAT TGTTCTAAAT CCAATCTTGT AAGGATTTAA CTTTAACATG 10680

GGTTAGCGCC TCTTATCTTT CTTCACAATA TTTATTGAAT AATGTTTGTA ATTGATTAAT 10740

TACGTTCATC ACATCATGAC CTTCGATTTG ATGTCTTTCA ATCATTTCTG TAATCTTTCC 10800

ATCTTTTACT AATGCAAATG ACGGACTTGA AGGCGCATAA CCTTCGAAGT ATTCACGCGC 10860

TCTTTGTGTC GCTTCTTTAT CTTGTCCAGC AAATACTGTC ACTAGACGAT CAGGTAATAC 10920

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AGAATTGATC ATAACTAGTG TTGTACCATC TTGTTTAAGA ACTTTGTCAA CATCTTCTGC 11040 

AGTAGTTAAT TGCTCATATC CCGCAGATTC AATTTCATTC CTTGCTTGTT CTACAACACC 11100 

GTTCATGTAT AAATCGAAAT TCATGnCCAT AAGTTCAATC ACCTATCCCT TTATATTTAA 11160 

ACTAtCCTCA TTCTACTAAT TAATAACATA TTGTTCAATA AACTAATCTG AATCACACCT 11220 

ATATTTAGAC ACAATTTTAA CAATATACCA AACATTATTG TGCTTAAAAT CATGGTAACT 11280

AATTTGTTCA CATGTTTTCA TTAATATGTT TCAAGTATGA TGTCTTATTT TGACTTTACT 11340 

GCAAAAATGC ATTCAACCAT GTTGATTATT GTTCTTTATC TTTTTTGAAT ATATTGCACA 11400

TATTTTAGTG CCAAAAAATA ATACATCCAT CGACAAGAAC AAGATAAAAC AAGTTGTCGA 11460 

TAGATGCATC TATGTTATCA CTAATATATA TTTGTATTTT CTAAAGTATA CTGTTCGATA 11520 

CGCTGTTTAA TATGATTCAT ArATTTACCT GTTTGTAAAC CATCTAAAAT ACGATGATCA 11580 

ATTGAAATAC ATAAATTAAC CATGTTACGA ATTGCAATCA TATCATTAAT TACTACTGGC 11640 

TTTTTAACGA TTGATTCTAC TTGTAAAATC GCTGCTTGTG GATGATTTAT AATACCCATT 11700 

GATGATACTG AACCAAATGT ACCAGTATTA TTTACCGTAA ATGTACCGCC CTGCATATCT 11760 

TCAGCTGTCA ATTGCTTATT ACGCGCTTTC GTTGCTAAAG TATTAATTTC TCTAGCTATA 11820 

CCTTTGATTG ACTTTTCGTC TGCATGCTTA ATCACAGGTA CGTATAATTT ATTTTCATCA 11880 

GCAACAGCAA TTGAAATATT AATGTCTTTA TGTAAGACAA TTTCATTTCC TTGCCAGCTA 11940 

CTATTTAATA AAGGATATGC TTTTAAAGCA TCTGCTACAG CTTTTACAAA GAAAGCAAAG 12000 

AACGTTAGAT TATATCCTTC TTTATT7TTA AAGCTGTTTT TATAATGATT TCTCGTATTC 12060 

ACAAGATTTG TAGCATCTAC TTCAATCATC ATCCATGCAT GTGGAATCTC TGTTACACTA 12120 

TTAACCATAT TTTGCGCAAT TGCTTTACGC ACACCATTTA CTGGTATTGT GCTGTTTTCA 12180 

CTATTGTCTT CAGATGATTG GTTACTTGAT GTATCTACTG ATGTTGATTT TGTTTGAACT 12240 

TGTTTGTCAG ATTGAGCTG7 GGTACCACCA TTTTCAATAA CTGACATTAT ATCCTTCTTA 12300 

GTTACACGAC CTTCAAATCC ACTACCTACA ACTTGTGATA AATCAATGTC ATGCTCTGAA 12360 

GCGAGTTTAA ATACAACAGG TGAAAAGCGA CCATTATTAC GTGGTTGATT TTGTTTAGCA 12420 

GTAGATGTCT GTTCCACTGT TGCACTAGCT TTTTTAGTAG ATTTCTGAGT ATGCTCATCC 12480 

ACTTTTGCTT GTATCTCTTC AGTTGTTTCA TTTGTCTTTT CATCAGCAGT TTCAATTTTA 12540 

CAGATAATTG TATCAATAGC TACTGTCTGC CCCGCTTCAA CTAAAATTTC TGTAATTGTT 12600 

CCTGATATCG TGGAAGGGAC TTCAGCTGTC ACTTTATCTG TAATAACTTC ACATAATGGT 12660

TCATATTCAT CAATATGATC ACCAACAGAA ACTAACCATT GTTCAATGGT GCCTTCATGA 12720 
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AATTCACGCA TTTTATTTAA GATTTTTTCT GGATTCATCA TAATTTCATT TTCTAATACA 12840

GGAGAAAATG GCATAGATGG TACAtCTGGA GCAGCTAAAC GCATGATTGG TGCATCTAAA 12900 

TCGAACAAGC AATGCTCTGC AATAATCGCT GACACTTCTG ACATAATACT ACCTTCTAAA 12960 

TTATCTTCAG TTACAAGTAA AACTTTACCT GTATGTTTAG CACGATCAAT AATTGTTTCT 13020 

TTATCTAATG GATAAACAGT TCGTAAATCA ACGACTTCAA CATTGATACC GTCTGCAGCT 13080 

AAAATATCCG CTGCTTGTAA ACAATAATTG ACCATTAATC CATAACAAAA TACTGTTAAA 13140 

TCTTCACCTT CACGTTTCAC ATCTGCTTTT CCTAAAGGTA CAGTGTAATA TTCTTCTGGC 13200 

ACTTCTTCCT TTAAGAAACG ATAAGCTTTT TTATGCTCAA AGTACAATAC TGGATCATTT 13260 

GATTCGATAG ATGATAATAA AAGCCCTTTA GCATCATACG GTGTGGAAGG AATAACAATT 13320 

GTTAAACCTG GCGATGAAGC AAATATACTT TCAATACTTT GTGAATGATA TAGTCCTCCG 13380 

TGAACACCGC CACCAAATGG TGCACGAATC GTTAATGGGC ATTGCCAATC ATTATTTGAA 13440 

CGATAACGCA TTTTCGCAGC TTCACTAATA ATTTGATTTG TCGCAGGTAA AATAAAATCT 13500

GCAAATTGAA TTTCTGCAAT TGGTCTTTTA CCXACCATAG CTGCACCAAT GGCAGTTCCA 13560 

ACAATATTTG ACTCAGCTAA TGGCGTATCG ATAACTCTGT CTTCACCATA TTTTTGTTGC 13620 

AGTCCTTGAG TAGTACCAAA TACGCCACCT TTTCTACCAA CATCTTCACC AAGAATAAAC 13680

ACATCTTTAT TTTGTTGTAA TGCTAAGTCT TGTGCCtGcG TATCGCCTCT AAATAAGATA 13740 

ATTTAGCCAT TAGTTAAGAC TCCCTTCTTC GTACACAAAT GCATAGGCTT CTTCGACACT 13800

TGGATATGGC GCGTCTTCAG CAGCCTTTGT CGCTTTATTG ATGATGTCTT TnATgTCCGC 13860 

TTCTATTTCT GCCAACCAAG CATCATCGAT AATGCCAGCT GAAAGCAACT CTTTTTTGAA 13920

CTTTTCATTG CAGTCTGCTT TTTTAAGcGT TTCACGCTCT TCTTTCGTAC GATATTGGTC 13980 

GTCATCATCT GATGAATGAG CTGTCATACG ACTTGTTACT GCTTCAATCA AAGTTGAACC 14040 

TTGACCAGAA ATAGCTCGAT CTCTTGCTTC TTTCATCGCT TTATACATTG CTAATGGATC 14100 

ATTACCATCT ACTTGTTCAC CATGTATACC GTAACCAAGT GCTCTATCCG ATAATTTTTC 14160 

AGCTGCGTAT TGTAATGAAT CAGGTACTGA AATTGCATAT TTATTATTTA TAATGACACA 14220 

TACAAAAGGA AGTTTGTGTA CACCCGCGAA GTTTAAACCT TCATGGAAGT CACCTTGGTT 14280

TGAGCTACCT TCACCAACAG TTGCTGTTGC AATTTTCTTC TTACCATCCA TTTTTAAAGC 14340 

TAAAGCAGCA CCAACAGCAT GGGGTATTTG AGTTGCTACC GGTGAACTTT GAGACAAAAT 14400 

ATTCTTAGCT CTACTACTAA AGTGTGATGG CATTTGTTTT CCACCAGAGT TAACATCGTC 14460 

TTTCTTTCCA AACGCTGATA AAAACGTATC ATACGCTGAG ATACCCATAT AAGTAACGAA 14520 
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AATCTGAGTT GCTTCTTGTC CTTGACCACT TACAACAAAT GGAATTTTAC CTGCACGGTT 14640 

CAATAACCAC AGTCTTTCAT CTATTTTTCT ACCTAAATCC ATCCATTTAT ATATTACTTT 14700 

TAGGTCTTCT TCGCTAAGGC CTAATGATTT ATAATCAATC ATGTTAAATC CTCCTATTTA 14760

TACGTGAATA GCTCTACTTT CTGCTTTCAA TCCTAATTCC ATCAACACTT CAGAGATGGA 14820

AGGATGTGCG TGTGTTGTTA GTCCTAATTC TAATGCCGAG CCATTCATGA ACTGTAACAG 14880

TGATGCCTCA TTAATCAATT CTGTTACATG TGGACCAATC ATATTAATAC CCACAATTTC 14940 

TTCAGTTGAT TGATCAATCA CCATTTCGCT ATACCCTTCG TTTGTGTCAT GGCTATCAAT 15000 

CACTGCTTTA CCAATTGCTT TAAATGGTAC TTTAAAACTT TTAACTTTCA TTCCCTCTGC 15060 

CTTTGCTTGT TCAATGTTTA AACCGATAGA AGCAATTTCA GGTTGTGAAT AAATACACTT 15120 

AGGCATCATG TTATAGTTTA CTGGGATTGG GTTCCCCTCA AACATATGAT CAACAGCCAC 15180 

AACACCTTCT TTTGATCCAA CATGTGCCAA TTGTAATTTT CCTATACAAT CACCAGCTGC 15240 

ATAAATATGT TTATCTTCAG TTTGTTGAAA TTCGTTCGTT AAAATATGTC CTGATGTTGa 15300 

AAGtTTTATT TTAGTGTTGT TTAAACCAAT ATCTGATGTG TTAGGTTTTC TACCAATCGA 15360 

TAGCAACACT TTATCTACTT TAATTATGTC TGAGGAAATT TCAAACGTAA CACCATCTTC 15420 

GTTAACATTT ATATCATTTT CAGAAAGTTT TATTCCCTCA TAGAATTTAA CACCACGTGC 15480 

TGACAATGAT TTTTTTAATA GTTGTGAAGC TTGTTTACTT TCAGTTGGTA AAATTCTTTC 15540 

ACCTGCTTCT ATAACTGTTA CGTCAACACC TAAATCTATC ATCAATGATG CAAATTCCAT 15600 

TCCGATAACA CCACCACCAA TAATACCAAT ACTTGATGGT AACGTCTTTA ATGATAATAT 15660 

ATCATCGCTA GATAAAATTT TATCATGATC AAATGATAAG AATGGCAACT CTGCAGGCGA 15720 

AGAACCAGTT GCAATTAATA CAAATTGGTT GGGTAATAAG TCTGATTCAC CATCTTCATA 15780 

TTCGACAGAA ATTGTGCCAC TTTGAGGTGA AAATATAGAT GTACCTAGAA TACGTCCCGT 15840 

GCCATTATAA ATGTCAATGT GATTGTGTTG CATTAAATGC TTTACACCTT GATACATTTG 15900 

ATTAATAATG TCTTCTTTTC GTGCCAACAT ATTTTCAAAA TTAACATTAG CATCTTTGAC 15960 

ATCAACGCCA AACATTGCTG CCTGTTTTAC TGTTTGAAAT ACTTCAGCAG ATTTAAGCAG 16020 

CGATTTAGTA GGAATACAAC CTTTATGGAG ACAAGTACCT CCTAATAGTT GTCGTTCTAC 16080 

TATTGCCACT TTTTTACCTA ATTGAGACGC ACGTA7CGCA GCAACATATC CTGCAGTACC 16140 

TCCACCGAGA ACGACTAAAT CATATTGTTT CTCTGACATG TTCTTACTCC TAACTAATGA 16200 

TATATATCCA TTGAAAATTT ATTAATACAT AGTTTTCATG TCCATTAATT ACCTATTTTA 16260 

CATGATTGTC TATTTAGTTT GAATGCACAT AAATAAATCC ATAAATGAGT ATTCAACACA 16320 
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TAAATCAGTA ACACTTGCAC CTGAAATCAT TCGTGCAATT TCATCTACTT TATCATCGCT 
AATTAACTCT TGAACTTGTG TTGTTGTACG ATCATCTTTT GATGATTTCG AAATTAATAA 
ATGATGGTCG CTCATCGATG CAACTTGTGG TAAGTGAGAG ATACAAATAA CTTGTATATA 
TTCTGCTaTA TCTCGCATTT TCTCTGCCAT TT 
(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13794 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:
CCAATACAAC GTAAAAAGAT TGCTTGTGTT ATTAATGAGT TAGATAAAAT AATTAAAGGA 60
TTTAATAAGG AAAGAGACTA CATAAAATAT CAATGGGCTC CAAAATATAG CAAAGAnTTT 120
TTTATACTTT TTATGAACAT TATGTACTCA AAAGATTTTT TAAAATATCG ATTTAATTTA 180
ACATTTCTTG ATTTATCTAT CTTATATGTA ATATCATCTC GAAAAAATGA GATACTAAAT 240
TTAAAAGATT TGTTTGAAAG TATTAGATTT ATGTATCCTC AAATTGTTAG GTCAGTTAAT 300
AGATTAAATA ATAAAGGTAT GCTAATCAAA GAACGATCCC TTGCAGATGA AAGGATTGTG 360
TTAATCAAAA TAAATAAAAT ACAATATAAC ACTATTAAAA GCATATTCAC AGATACTTCC 420
AAGATTCTCA AACCAAGAAA ATTTTTCTTT TAAATTTAAA CAGATTTACC TCTTGATAAA 480
ATAAATAAGC AATCATACTA CTTCTCAATT TAGTATAAAT AAAAATACAT AATTAACTTT 540
CTTTTGTTTT TATATTATTT CAATACCCTA CTATATATCA CAACACATAA ATTAAGCATG 600
ACACTCATTC AATTTAGTTC ACCATTTCGT GTTCCAATTT TACTGAGTAT CATGCTTTTA 660
ATGTTATAAA CCTAATGCTT TAATAAATCG TGTTAATTCT TCTCGCATAC TGTCATCTTT 720
CAATGCATAT TCTATGGTAG TTTTAACGAA GCCTAATTTT TCTCCAACGT CATAACGTTC 780
GCCTTCGAAG TCATATGCAT ACACTTGGTT ATCATTATTC ATACGTTCAA TCGCATCTGT 840
TAACTGAATT TCGTTACCTG CGCCTTCTTT TTGCGTTTTT AAATAATCGA AAATTTCAGG 900
CGTTAATACA TAACGTCCCA TAATAGCTAG GTTTGATGGT GCCGTACCTT GTGCTGGCTT 960
TTCAACAAAC TTTTTCACTT CATACTGACG TCCGTTTTTA GTTAATGGGT CAATAATTCC 1020
ATAACGATGA GTATCTGCTT CCGGAACTTC TTGGACACCT ATAACTGAGT GCCCTGTTTC 1080
TTCATAAACG TCAATCAACT GTTTCACTGC TGGCACTTCA GATTCAACAA TATCGTCACC 1140
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TAAACCTTTT TGTTCTTTCT GCCTTACATA AAAAATATTC GCAAGTTCCG TTGAATACTG 1260 

AACTTTCTCT AGTAATTCAG ATTTACCTTT TTCTTTTAAC ACCATTTCTA ATTCTTTTTG 1320 

ACTATCAAAA TGATCTTCAA TCGCGCGTTT GTGGCGACCT GTCACTATAA TAATATCTTC 1380

AATTCCAGCT CTTGCAGCTT CTTCAACGAT ATATTGTATT GTGGGTTTAT CTAAGATAGG 1440 

AAGCATTTCC TTTGGCATCG CTTTAGTTGC TGGTAAAAAT CTAGTCCCTA AACCAGCAGC 1500 

GGGAATGATT GCCTTTTTTA TTTTTTTCAA AGTTAATGTG CTCCTTTTCC TAAGTATTAA 1560

ATCTATGTAT CAACGTCATT TTAACACTAA TTAGAACGCC TTCATAGTGT CATTGAGTAT 1620 

GTAATTATTT CTTGGGAAAT TTGTTTTAAT TTTAAAAAAC AGGCTTACTT CATATAATTT 1680 

ATGAAATAAA CCTGTCAATT TTGGATTGAT TATGCTTTGT GATTCTTTTT ATTTCTGCGT 1740 

AATAACGCTA AACCTAAAAT GCTAAATAAT CCGCCGAACA ACATGCCGTT GTTTGTTGAT 1800 

TCTTCTCCAC CTGTTTCAGG TAGTTCAGAT TTCTTAGATT GTGCTTTTTT AGTTGGTACC 1860

ACTGCTTTAA CCTTTTCATT GATTTCAATA ACAGGTGTTA CTACTTTACC TTGTTCCACT 1920 

GGTTTAGAAG GTTTTTTAGG TTCTTCTTTA GCAGGTGGTA TTGGTTTACC AGGTTCAGTT 1980 

GGTACCTCTG GCGTTGGCGG TGTTGGTGTT TCCGGCTCGC TTGGTACTTC TGGTGTCGGT 2040

GGTGTTGGTG TTTCCGGCTC GCTTGGTACT TCTGGTGTCG GTGGCGTTGG TGGCAGGATT 2100 

GGAGGTGTTG TATCTTCTTC AATCGTTTGT TGACCTTCAT TATGACCACT TACTTGTGGA 2160 

AGTGTATCTT CTTCAAAGTC AACACTATTG TGTCCACCGA ATTGATAATT TGGTTTATCT 2220 

TTATTTGTAT CTTCTTCAAT AATTTCAGTG TGCTTATTGA ATCCGTGAAT ATGTGGCACA 2280 

CTGTCGAAGT CGATATCAAT GATATTACCA CCTTGTTCAT ACTTAGGTTT GTCTTTCTCT 2340 

GTATCTTCTT CGAATGATTG GTTACCATTA TTTTGACCAT GAATTTGAGG TACACTATCG 2400 

AAATCGATAT CTACGATATT GCCACCTTGT TCATATTTCG GTTTATCTTC TTCTGTGTCT 2460 

TCCTCAAATG ACTGATTACC GCTATTTTGG CCACCTTCGT AACCTAATTC ACTCTTAATA 2520 

TCCACGTGGC TATTTTCTTC GATTTCTTCA ATCACGCCAT AATTACCGTG ACCATTTTCA 2580 

GTTCCTAAAC CAGAATGAGA AATATGATGA TTGTTTTCAG TAATTTCCTC GATTGGTCCT 2640 

TGCGCTTGAC CATGTTCTTC AGGTAGTTCA TCTACTAGTT CAATCAGATT ACTTTCAGTC 2700

GTATATTCTT TCGTATCTTC AATTGTTGTA TGATCGCTAA CAGCACCAGT TACAATACCT 2760 

TTTGTAGAAT CTTCGTCAAA TTCAACTAGG TTAGACTCAG TAGTAACCTG ACCACCACCT 2820 

GGGTTTGTAT CTTCTTCATA TTCAACAACA TCAGCATGAT GTTTTGAATT TTCATGTGTC 2880

GATTCTTCAA AGTCTACATG AATAGAATCT TCTTCAGTTT CAATGGTACC TTCTGCATGA 2940 
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TCTTCGATTG TACCAGTCAA TTCATGCTTC TCCACTGGCG GCTCTGATTT AAATTCAAGT 3060

TCGATAGGAG TACTATGTTC TATAATAGGT TCCTTTAGTT TATCTTTGCC GTCGCCTTGA 3120 

GCGTTATTAG AGTAAAATGC AACGCCATTT TTCCaAGTTA AATTACTTGT ATAATAATAG 3180

TTATAATATC CAAAAAGGTG TGTTTGAAAT TCTAAGTTGC TAGCATTTGA ATCATAATAC 3240 

CCTTCATATT TTATTACATA ATTTTTACTT TGGTCTAAAT TATTAAAGTT TAAAGAATAA 3300 

CCACCATTAG TATCAAAATC TAAACTCATA TTATCAGTCA CATCTTCAAA TTTGCTGACA 3360 

TCATCAAGCT TTGCATAnTn AgctTTCAGC TAAATCGTCT GAACCAATGT GTTTATATAC 3420 

CTTAACTGTT GGATTATTAA CCCCTGGTTT ATTTCCTTTA GTTACTTGAC CAGTTACTGT 3480

CACAGAGCTT AACGACTGGT TGTTAGGTTT CATGTACGCA AAATGACTAA ATTTCCCATC 3540 

TACTTTATTT AAAGTATCAA TTCGACCATT AGCTGTTACT CCCCAATTAT CTCTAACTCC 3600 

ACCTAAATAT TGAATATTAA ATATTTTGCT AACCGTAGTC TCACCCAATT TAACTTCAAC 3660

ATTTTGGTTA CCTTTTTGCG TCACTGTTGT AGGATCAATA AATAGATTTA AAGATAATTC 3720

AGCAGTTAAA TCTTTCTTTT CTTGTACATA TTCTTTAAAC GTATATCTAA CTTTTCTTTC 3780 

TCCAATTATT TCTCCTGTCG CCATAACTTG ACCATCTGTA CTTTTTATCT CCGGAACTTT 3840 

ACGCAGTGTT GAGATACCAT GAGTTTCAAC ATTATCGCTT AATGTGAAAT CAAAATAATC 3900 

TCCCGCCTTA ATTCCTTCTC CAAATTTCCA TTTATATTTC AAGGTTACTC TTTCTGCGTT 3960

ATGAGGATTT ACAACATTCG TATCTTGTTT ATGTCCTACA ATTTCACTAC CTTCTTCTAC 4020 

TTCCACTTTA TTTGTTACAT CTGTACCTGT CGCTTTAGTT TCTTCCACTA CTTCTTTCTC 4080 

TGCAACTGCT GTAACGTCAt TGatCTTTTC ATTCTTGGTT TAATTTCTGA GACGTTACTT 4140

GGTTGAGCTA TGTCAACTTG AGTTCCTGTA GTTTCCTTAT CAGCAACTTT TTCCGATGGC 4200 

AAATCAACTC GCGAAgTTTC TACTTTTGGT GCTTGCAcAG TTTTCGGTGC TTCTTCTGTT 4260

GTTACTTGTG TTGATTGTGA TGGTTGCTCA GTTGATGTCG CGCTGTATGA TTGTGTTTCA 4320

TCTATTGTAT TAACGTTATT TGTAGTTGTT TGTGTTTCGC TTGCTTTACT TTCAGTAGCT 4380 

GAACTCCCAC TTTCCTCTAC TGTAGTATTG TTTTGTTCCG ATGCTGCAGC TTCTTTTTCT 4440 

TGTCCCATTC CAACAACGAT CATTGTTCCT AAGAATACTG AGGCCGCTCC CAATTTGTGT 4500

TTTCTTATGC CGTATCTAAG ATTGCTTTTC ACTATAATAT TCTCCCTTAA ATGCAAAATT 4560 

CATTTATTTT TAAAACTCAA TAAATGCAAT TCTATATTGT TCGGTTTTTA AAAGCAATGA 4620 

AAAAAAGCGA GTTAATAAAA AGTTAAGATT GTTGTTAACT TTATGTATAA TGAGTTTTTT 4680 

ATTATTTGAA ACTCACATAT ATATTGCATA CAAAGCTCTT GAACACCTTG ATATAACAGG 4740 
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TACTAAACCA TACATAATAA TCGCCTGTAC AATGCATCAT TAACAAGTCA CTGAAACGCC 4860

TTTCATTGTA TTAATAACGT CACTATAATT TTTATATCGT TCGGTTTTTG TTTGATTTTA 4920

ATGATTATTT ATACAAAAAC AGCCGTATTT CAAGCCGACA TTTTAAATTT AACTAAATTT 4980

GCATCTAGTT AATAATTGCA TTTATCAAAT TTGTCTTATT GATCCAATCT AATTTGTACT 5040

CACAAACTAG TTTAAAATTC TAACTTTATC TCTCAGTTCG TTATCAATCA TCAGACATAA 5100

ACCAATGAAG CAATCAGAAA ACACTCTAAT TTTCTATTAG AAATTTGATT TAATATAAAA 5160

AAACAGGCTT ACTTCATATA ATTTATGAAA TAAACCCGTC AATTTTTGTT TAATTATGCT 5220

TTGTGATTCT TTTTATTTCT GCGTAATAAT GCTAAACCTA GAATGCTGAA TAATCCGCCG 5280

AACAACATAC CTTTGTTTGT TGATTCTTCT CCACCTGTTT CAGGTAGTTC AGATTTCTTA 5340

GATTGTGGTT TTTTAGTTGG TGCCACTGCT TTAACCTTTT CATTGATTTC AATAACAGGT 5400

GTTACTACTT TACCTTGTTC CACTGGTTTA GAAGGCTTTT TAGGTTCTTC TTTGGCAGGT 5460

GGTACTGGTT TACCAGGTTC AGCTGGTACC TCTGGTGTTG GCGGTGTTGG AGTTTCTGGC 5520

TCACTCGGCA CTTCTGGTGT CGGTGGTGTT GGTGTTTCCG GCTCACTTGG TACTTCTGGT 5580

GTTGGTGGCG TTGGTGTTTC CGGCTCACTT GGTACTTCTG GTGTCGGTGG CGTTGGTGGC 5640

ACGATTGGAG GTGTTGTATC TTCTTCAATC GTTTGTTGAC CTTCATTTTG GCCGCTTACT 5700

TTTGGAAGTG TATCTTCTTC AAAGTCAACA CTATTGTGTC CACCGAATTG ATAACTTGGT 5760

TTATCTTTAT TTGTATCTTC TTCAATAATT TCAGTGTGCT TATTGAATCC GTGAATATGT 5820

GGCACACTGT CGAAGTCGAT ATCAATGATG TTACCGCCAT GTTCATACTT AGGTTTGTCT 5880

TTTTCTGTAT CTTCCTCGAA TGACTGATTA CCTTTATTTT GACCATGAAT TTGAGGTACA 5940

CTATCAAAAT CGaTATCTAC GATATTGCCA CCTTGTTCAT ATTTAGGTTT GTCTTCTTCT 6000

GTGTCTTCCT CGAATGACTG GTTACCGCTA TTTTGGCCAC CTTCATAACC TAATTCACTC 6060

TTAATATCAA CGTGGCTATT TTCTTCGATT TCTTCAATCA CGTCATAATT CCCGTGACCA 6120

TTTTCAGTTC CTAAACCAGA ATGAGAAATA TGATGATTGT TTTTAGTAAT TTCCTCGACT 6180

GGTCCTTGTG CTTGACCATG CTCTTCAGGT AATTCATCCA CTAATTCAAT CAGATTACTT 6240

tCAGTTGTAT ATTCTTTCGT ATCTTCAACT GTTGTATGAT CGCTCACtGC GCCAGTTACA 6300

ATACCTTTTG TAGACTCTTC GTCAAATTCA ACTAAGTTAG ACTCAGTAGT AACCTGACCA 6360

CCACCTGGGT TTGTATCTTC TTCATATTCA ACAACATCAG CGTGATGTTT TGAATTTTCA 6420

TGTGTAGATT CTTCAAAGTC AATTGGATTT GATTCCTCAG AGGACTCAGT GTATCCTCCA 6480

ACGTGACCTG CtTCGCTATC CACAGCAGTA TGGTAATCGA TATCAATAGC TGATGAATCC 6540
ATGTTAATTG ATAATTTTAT TATTTGAAAT ATACCTATAA ATTGTATTCA AGTCATCAGA 8460

AACCCTTGTC ACACAAGGCT TGTATTTTTT ATACTTATTT TTTAAATTAA ATTCATCATT 8520

ATCTAATTTA AAACAATATA CTAAACGTTT CATAATTATC GCCTGTACAA TACGCACAAA 8580

AACATGTCTT GAAACGCCTT TCATTACTCT AAAATACCCA ATATACTTTT TATATCGTTC 8640

GGATTCTGAG TATTTCAGAC GATTTTCTGC ATAAAAATAA ACGTGTTTCA AGGCAATATA 8700

TTGCAATTAC CTAAAAACAC GTTTACTTAA TATTTAGTTA AACAAATAAG CTAATGAATA 8760

AAATGAAGAT GATACCTGAA ACGGAAATAA TCGTTTCTAA TAATGACCAT GTTAAGAATG 8820

TTTCTTTTAC AGTTAAACCA AAATATTCTT TAAACATCCA AAATCCTGCG TCATTTACAT 8880

GAGACAAAAT CACACTACCT GCACCTATCG CAAGTACAAC TAATGCAACA TTTACATCTG 8940

ATGATTGTAA TAATGGTAAG ACAATACCTG TAGTTGAAAT CGCAGCTACT GTAGCCGAAC 9000

CTAATGCGAT ACGTAGCACA GCTGCAACAA TCCATGCTAG TAAAATCGGA GACATCTCTG 9060

TACCTTCAAA CATTTTAGCA ATTGTATTTC CGACACCGCC GTCAATTAAT ACTTGTTTAA 9120

ATGTACCGCC ACCGCCAATA ATCAATAACA TCATTCCGAT TGGATAAATC GCATTCGTCA 9180

CTGATTCCAT AATATGATTC ATCTTACGCT TTCTCATTAA TCCCATCGTA ACGATTGCAA 9240

ATAATACTGC TATTAGCATG GCTGTCCCTG CTGTTCCTAT CATATAAATG ATAGATTCAA 9300

ATAGATTTGT AGGTTTGTCA TGCCCAGTTA CAAGTTGCGT TATCGTAGAC ACTAACATTA 9360

ATATGACTGG TAATGTTGCT GTTAATAAAC TCATACCAAA TCCTGGCATC TCTTGATCCG 9420

TAAATTCTTT TTGTGCACCT AACGCTGAAA TATCGCCTTC TCGTGTATAC GCAGACGGAA 9480

TCATTTTTTG TGCAcTTTGT TAAATATAGG CCCTGCAATG AGTGTAACTG GaATGGCAAT 9540

AATCATACCA TACAGTAATA CATCTCCAAC ATTTGCCTTT AATTCTTTTG CGATGACTAC 9600

CGGTCCTGGA TGTGGTGGTA AAAAGCCATG TGTCACTGAT AAAGCTGTTA CCATAGGTAG 9660

TCCTAGTTTT AACACTGAAA CATTTGCGCG TTTTGCTACT GTAAATACTA ATGGAATCAG 9720

XAAGACTAAA CCTACTTCAA AGAACAATGC AATACCGACG ATAAATGCTG CAACAAGCAT 9780

TGCCCATTGT ACATGTTTTT GACCAAATTT TTGAATCAAC GTGTCTGCGA TTCGAGTTGC 9840

ACCACCACCA TCAGCAAGCA ATTTCCCAAG TATGGCACCT AAACCGAATA TCAGTGCAAT 9900

GTGGCCGAGC GTACTGCCCA TTCCTTTCTC AATCGTCTCC ATAATTTTAG TCAATGGTAT 9960

ACCTAGCATT AACGCTGTAA TCATCGATGT GATAATTAAT GAAATAAATG TATTTAATTT 10020

AAACCCAATA ATTAATACTA ATAAAATAAC GATACCTAAA ACAACACTGA TTAACGGCCA 10080

TATTTCGTTA AACATGACAT TCCCCTCTTT CTCTTTTCAA TAGAATGTAA CACCGTCGTC 10140
GAGTGACGTA TTTATTGTGT TTTATTTTCA GCGATATGTT GGCGTTGAAA ATCTGCAATT 10260

TGTTCATAAT TCTCTGTTAA AGAACGACTT AAATTGATAA AAATGGATAC GATCTCTTGG 10320 

TAAACAGTGA CATTTTCTTC AATCGGCGTA TGATTGTTTG TGGCACCGAC CATCGATGAA 10380

ACGATTGAAA AATCTTCAAT GTCACCTACA GCTTTAAGTC CGAGCACGCA GGCACCTAAG 10440 

CATGAACTTT CATAACTTTC AGGAACCACT AACTCTGTGT CAAATATATC TGACATCATT 10500

TGACGCCATA CTTCACTTTT CGCAAAACCA CCTGTTGCTT TTATCATCTT AGGTGTTTCA 10560 

TTCATTACTT CAATAAGCGC AAGATAGACG GTATACAAAT TGTAAAGAAC ACCTTCTAAT 10620 

GCAGCGCGAA TCATATGTTC TTTTTTATGA GATAAAGTTA AACCGAAGAA TGAACCTCTT 10680 

GCATTTGCGT TCCAAAGCGG CGCACGTTCT CCTGCTAAAT AGGGATGGAA TATTAAACCA 10740 

TCTGCACCTG GTTTAACACG CTTTGCAATT TGAGTTAAGA CATCATAAGG ATCAACACCG 10800 

AGACGTTTCG CAGTTTCGAC TTCACTCGCT AGCAACTCGT CGCGCAACCA TCTCAATACG 10860 

ACACCACCAT TATTTACAGG ACCTCCGATG ACGTAGTGGT CCTCTGTTAA GACATAACAA 10920 

AATATTCTAC CTTTGTAATC AGTACGCGGT TTATCTATCA CAGTACGAAT CGCCCCAGAT 10980 

GTACCGATTG TGACAGCAAC TTCTCCTTTA CCAACACTAT TGACACCTAA ATTAGAAAGG 11040 

ACCCCATCAC TCGCACCAAT AACAAACGGT GTATCTTTAT TAAGCCCCAT TAATGTTGCA 11100 

TAACGTTCTT TCATACCTTT CAtCACATAC GTTGTTGGAA CTAATTCCGG CAACATTTCC 11160 

TTGGAAATAC CCAGCAGTTC TAATGCCTCA ACATCCCAAT CTAATGTTTC TAAATTAAAC 11220 

ATCCCTGTTG CGGAAGCCAT TGAATAATCA ATGATATATG TATCAAATAA ATGATAGAAA 11280

ATGTATGTTT TAATATCTGC AAACTTAGCA GTACGTTGAA ATACATCTTG CCATTCATGT 11340

TTCATCCAAA AAATCTTCGC TAATGGCGAC ATAGGATGAA TCGGTGTGCC TGTTCGCTGG 11400 

TAAATCGCAT TGCCATCATG CACTTCATTT ATTACTGTTG CATATTTTGC AGCGCGGTTA 11460 

TCTGCCCAAG TAATAT7ATT TGTTAATCTT TGATGTTGCT GATCCATCGC AATCAAGCTA 11520 

TGCATTTGCG CACTAAATGA CACAAACTTA ATGTCGTCTT TATTAACTTT GGATTCTCTC 11580 

ATAACATATT TAATAGTCAT TAGTACTGCA TCAAATAATT CATCTGGGTT TTCTTCTGAG 11640 

ACATCAACGT TTGGTGTGTG TAAATCATAG CCTATTTGAT GTTTCATGAT AAAAGTTCCA 11700 

TTTTCATCAT ATAAGACTGA CTTGGTACTC GTCGTTCCAA TGTCGACACC AATCATATAT 11760 

TTCATGATAA ATCCTTCTTT CTTTCATTTT AATTCAACCA AAATCCTTCA ATATCTTTAC 11820

CAACATCGTC GAAATTTAAA TGAAACGCTT CTTTCAAAAT TTGACTGTCG TATTGTTCCA 11880 

CTGCATCAAT AAACACTTGA TGATTATGAT GTATGCGTTC AAAATCTTGC GGGTTCTGTT 11940 
AAAATGAGTT TAAATATTGA TGATTAGATG CTTTGATTAA TGTTTCATGA AATTCAAAGT 12060

CATGCTTCGT AAATGATTCT GCATCCTCAA ATTTTACTGC CACTTTCATC ATTTCAAGTT 12120

GTTTCTTCAT TTCTTTTACG ATAGGTAGTC GCTCTTGATT TTTAACTCTT GAAAATGCAA 12180

ATGACTCTAA CATCAGTCGC AAATCATACA TTTCTTTCTT TTCTTGTTCC CCAAACGGCA 12240

ACACATGTGC ACCCATTCTT TCTAATTGGA TGAGTTGATT TTGTTGCAAT AATTTAAATG 12300

CATCTCGAAT TGGCGAACGA CTCACATTAA ATTGCTTTGC CATTTGATTT TCAGTGAGTA 12360

ACGTACCTTC AGCTATGTGA CCATTCACAA TGCCTAAGCG TAATTCTGCC GCGATACCTT 12420

CTCCAGTTGT CATACCTTCC AACCATTTCT CTGGATATCC ATACATCATC AAAGTCACTC 12480

CTTCATTACA CGACATACTT GTATACAAGT ATGTTAATAT AGTTATTATG AGTTTGCAAG 12540

CGCTTTCTTT ACGAGCACTA AAATAGTGAC CACCCCTTTT CGATTTAAAT TTAAAGGAAA 12600

TGGTCACTAT CACACGAATG ATTTAATTGT TATGTTGTAT GTGGGATATT TCTAATTGTT 12660

CTGTACTCAT ATGCGCTTTA GGTACTTCAA TGCAATAATG CGTTTCATGA CAGTTTGGAC 12720

ATTCGAATCG ACGTGTTGTC GCTGTATGTT TCGCTTTGAT AACTGCCCAC AAAGATGGTG 12780

AGAATATATG CTGGCAGTTA GGACATAAAT AGGCAACCTT TTGTTGGTAA TAAAAAGTAA 12840

CACCAATGCC ATAACCAATC ATAAATGGTA AAGCAATTAA AAACGGCCAT TTATTTTTCA 12900

TCAAAATTGC ACTTATAATG CTAGAATATT GAATTATTCC TATAATACCA GCACTAATCC 12960

AAATGTTACG ACGAATACTT TTCATTTCAG CTGATTTACT CATGACATGC TCTATGTCTT 13020

TTAAGTGTGT GATTGGAGAC GTCGACGCTT CATTTACGTA ATATTGAACA TTTTTAATTT 13080

TGTTTAATAC CGCTTGTTGC TGTTTAACTT GTTGGTTAAT TTCTTGTTGT TTCATAGTTA 13140

GTAAAGTATT GAGCGTCTTC AAAGTACCTT CACCTTTTAG CAACATATCT ATATCGCTTA 13200

ACGC&CAACC TAAATCTTTA AGCAATAAGA TTAACTCTAA TGTTTGTCGC TGTTGTTCTG 13260

TATACACACG ACGCTTTCCT TCTGTAAATC CTTGTGGTTT CAAAATACCT TTGCGATCAT 13320

AATATTGAAT CGTTCGTGTT GTCACATTGC ATAATTTTGC GAGTTCTCCA GTCGAATAGT 13380

TAGACATAGA TTCCACCTCC TATAATTACC ATAGTTGATG ACCCGACGTC ACGAGCAAGT 13440

ACAATTTCCA CATTTTAAAG AAATTTATTA TACTAGGCGT CTTATTiVTA TGATTTCGTA 13500

CCATGTTGAT TTACAAACTC ACTCAAACTA AGTAACACAC CTACTAAACA TCTACTCTGT 13560

TATTTCAGAA TGAATTTGTT GTAATTTATC TTCAACTTCA GTAATCTCTG TCGCACATTC 13620

TTTCAGTAAA TCTCGATACT TTTCCGTCTC TGCATTGTTT TTATAACGTA TriTATGTTC 13680

TAAACTTGcC CACATATCCA TACCTATCGT TCTAATTTGA ATTTCAACAG GCAATACCTC 13740
(2) INFORMATION FOR SEQ ID NO: 55:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1059 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55: 

GGATAAGTTC AGGTAAATTC ATTTCTTTTT CAATTTTGAT TTTCATTGTT TCCGCCCTTT 60 
TAAAATAAAG TTAGTTGCTT CTGTTCCTCA TATTCCAAAT CACTTTGCTT TATATATGTT 120

TCAAGCTCTT CCGCTGTATC AAATGTCTTT TTCACACCTT GCCAACCTGG CACGATATGA 180 

CCGTGAAAGT AATAAGTGCC ATTTACTACA TGGATATGTG CCACTCGTTC GTTATCCTGA 240 

TACAGATATC TCTTAGATCC AAAGAATTGA TTTAGGTATT CTTTACGCGC GCTATCTGTC 300

ATGGTCATCA CTCCTTTTAA CAATTAGGCA GACCAAACGA CATGCATTCG TCGTATAGCT 360 

CTTCATTACT TATGCTTGCC TTATAGTTTT CAATCACATT GCTAACTTCT TTATGACTCA 420 

TTGCTTTAAC TTGTTCGTCT GTATATTTTT CGCAGTCTTC TAATTCCAGT TGCTCCTGTA 480

ATGACATCAC ATATTCAACT TGTCTTTGGG TTGCCATCGT TAACCCTCCC ACAAGTCAAA 540 

AGCTCTTTGG ACGTAAAACT TCGCCTTTGC TAAATCCTCA TGACCATTCT TTAACGGTGC 600 

TCTAGACATG TATTTGATTG CATTACCTAT TGCGAATGCT AGTTGAGGTG GATACTGTGC 660 

CGTAACCTGT TCGATAAAAT CTATAATTTC AATGTCGCCG TATGTGTAGT GCGCTGGTTG 720 

CTTAACATTG TCTTGCGCTT CGTTCATATC TACTTTTCTG TTACTGATTA CGCTCATTAT 780 

GCTTCACTCC ATTTCTTGAA CATTTGGTTA TAAGTGACAT CGAACCAGTA CGGATCACGT 840 

GAAT&TTTTT GTGGCGTTCC ATCATAAAGC CATGGTCTTA ATCTTCTCTT TCTTTCCTGT 900 

TCATATTCCG CTCTCACATT TCGTTGGTAT CGGTTCAAAA TCGCTTTTTT TCTGATTTTT 960 
TCTCTCCCTT TTTCTTCATC TTTnATtTGA CTCTnCATAT ATTCAACTTC TTCTGTAGAT 1020 
nTTGAGTCCT TTCTTCCACA CAATAATTCA nCGCCGCGC 1059 
(2) INFORMATION FOR SEQ ID NO: 56:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30246 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
GAAGTAAAAG AAGAATTAAA TTTAACATTA ACAATGGATG AAATTGAATA TGTCGGGACA 60
ATTGTAGGTC CTGCATATCC ACAACAGGAT ATGTTAACTG AGTTAAATGG ATTTCGCGCA 120
TTAACCAAAA TCGATTGGGA AAACGTAACT ATCAATAATG AAATTACGGA TATACGCTGG 180
ATTGATAAAG ATAATGATGC GTTGATTGCG CCTGCTGTCA AAGTTTGGAT TGAAACTTAT 240
GGTGGTAAAC ATGACAAATA ATGACACCAT CATGTTACGA CATTATGTCC CACAAGATTA 300
TTCGATGTTA GAAGCTTTTC AATTAAGTGA AAGTGATTTG AAGTTTGTTA AAACGCCAGA 360
GGAAAATATT ACAGCTGCAA TGTCTGATAA TGAAAGGTAT CCCATCGTTG TAATGGATGG 420
CAGGCAATGT GTGGCCTTTT TTACATTACA TCGTGGAAAA GGGGTCGCAC CATTTAGCGA 480
TAACCAAGAT GCAGTATTTT TCAGGTCATT TAGTGTTGAT CAACGTTATC GTAATAGAGG 540
AATAGGTAAA GTGGTAATGG AAAAATTGGC GTCATTTATC ACTTCAACAT TTCAGGATAT 600
TAATGAGATT GTGTTAACGG TTAATACTGA CAATCCACAT GCCATGGCAC TTTATCGCCA 660
ACAAGGATAT CAATATATGG GAGATAGTAT GTTCGTCGGA AGACCTGTTC ATATTATGGC 720
GTTAACTATA AAATAAATTA AATTTAAAAG CATCTTTACT CATCGTCGAC CACAACAATT 780
AATGATGAAT AAAGGTGCTT TTTGTTATAG ATCATCGGAC AATTTACTAT AGTAAAAAGC 840
GACCTAGTGA ACAATTGACA TATATCCACA GGTCGCTTAA CTTAAGTTAT ATTGCTAGTT 900
GCGATTAATT GATAGACTCA TCATTTTTGC GCTGTCGAGA TGGTCTTTTT ATTAAAAATG 960
CCGTAATCCA AGCCGTAATC GGAATACTGA TTGCAACGGC AATACCGCCT AAAATAATAG 1020
AAATAAATTC TTGGGCAAAT ATTTTCGAGT TTATAATATG ACCAAATGAA TATTTAAGTT 1080
TGAAAAACCA AATAAATAAA GCAAGTTGGC CACCAAAAAA GGCAAGGTAA ATCGTGTTCG 1140
CAGATGTCGC TAAAATTTCT CTACCAACAC GCATGCCAGA TTGGAATAAT TCGTATTGCG 1200
TAACBTTgGA TTCACTTGAT GCAATTCATA AATGGGTGAA CTAATGGTAA TTGTTAAATC 1260
TATCACAGCT GCAATAACAG CAAGAATAAT AGTGAACACC ATAAATTGAA CCATATCAAT 1320
GCCAATATTC ATTGAATACA CATATGTTTC ATCTTGTTGT TCGGTTGaAA AGCCTTGTAG 1380
ATGACCGAAG TAGACCGATA AATAAATGAG 7GTAATCAAC AATATTGTTG TAACGATAgT 1440
GCtGgATAAA TGCaGCTTGT GTTTTAACAT TGTAACTATT GAGTACGAAT AAATTACAAG 1500
CGCCAATAAT AATGCAGAAA AAGAATGTGA CGACATAAAT CGGTACGCCA AAAATAATCA 1560
ATACAATACT AATAATTAAA ATAGCGAAAT TTAAAAATAG GGTTAAATAA GAGATGAATC 1620
CCTTTTTACC TCCGAAAATT ATCATCAGAA AGAGGAGCAA TAACGCCAAT ATAAATACAG 1680
CATTCATTGT TTCGCCCTCC TTAATGTTTC AAATATTTCC ATAAACAATA TTGTGATAGG 1740
CATCGAAATA GTATAAGTCA CTGTATTGGC ATTTTTTAAA AAGATTAAAA ACATAGGTAG 1860

TGCACCGGAT AAATATGAGA ATAATAAGAT GTTAGTCATT GTTCCCATAA TATCTTGGCC 1920 

GATGTTTCGC CCAGCAAGCG CCCATCTCCT CATTGAAATG TGTGGCGTAC GCTGTAAAAT 1980

TTCATGCATA CCACTAGCAA TTGTAATTGC AACATCCATA ATAGCGCCAA GTGAACCTAT 2040 

TAACACTGAG GCTAGGAAGA TATCTTTCGG TGGTAATGAT AAAAAGTTCA TCGTTTCATA 2100 

TTTAATGCCT TTACCATCTG TCATATATAT GATTAATTCT GTTAAACCTA TACTCAAAAA 2160 

AGTTCCGATA ATTGTACTGG CTATGGTAAT GAGTGTACGC ATATGCCAGC CTGTAACGAG 2220 

CAATAAAGTG AGTATTGTTG AACAGATCAT GGCAATGGTC ATGAGTAAGA ATAAATTAAT 2280 

ATTGCTATGT TGAATATGAA TGTAAATTGC GATTAATATG GCAATAGAAT TCAAGATTAA 2340 

CGATAAAATC GATTGCAGTC CGACTTTGCG ACCAACCAAT AATACAGTTA ATAAGAACAA 2400

ACCAGTGATG ATAACCGTTA AGGTATCACG CTTCTTTTCT ATAATATAAG CATCACTCGG 2460 

CTTGTTAGAA ATATGTAATA ATACTTTTTC GTGTGTGCGA AATGCCTCAG AATCTGCTTG 2520 

CGATTTGACG TACTGATGAT TAATCGTCGT CGTTTCTCCA GCAAATTGAC CATTTAATAT 2580 

TTTGACTTTT AATTGATTTT TATATTTAAT ATCACGATTA TTTTGTGCAT CTTTTGTAGG 2640

TGTCGAAGAA ACATGTTTGA CATCTATAAT TTGACCAATT GGTTTGTTGT AAAAGTTCTC 2700 

ATTATTGAAT GTAAATAAAA TAGCACCAAT GAATGCGATG CAGAACAAAC CTAAAATTAT 2760 

ATTAAATGGC TTTGTAAATA AATTTCTATA TTTCAAAAAC AAAACCCCAA TTCTATGAAT 2820

GAATTAATAT GGTGATTATA CGCCCTTAAT TTTTTATTTT CAAAGATATT ACTGCTAAGT 2880 

GTAAAACGAA AATCATCATT GATAGCATCG AATTACTTAA TGGAATGTAG ACGTTTTAGT 2940 

CATTAATTGC TGAATAAGTG TTAATAATAT GCCAATATCA CTCTTTGTAT AAGGCTCCTT 3000 

TGT^ATAGCA CATATCGTTC TTTTTAATTC AGTATGATCT AATTTTATAT CTATCCATGA 3060 

TTTAGATTCT GGTAAATGTA TATTTTGTGA TGAAATGATG TAACCTTCTT TTTGACGAAG 3120 

GAGATACTGC GCAAGTGGTT GGCTACTGAT TGTGTATACA TCTGATTTAG TAATCTTGCG 3180 

CAATTGTTTT TTTACAGTTT CGGCAAATGG TGCCAAGCAA TAAATATGAC TATGCTCAAA 3240 

CTGAATTAAT GGTGGGTGTG TCGCCATCGT AATTGGATCG TCTGAAGGCG CATATAAATG 3300 

ATAGTGCTCT TCGAATAAAG GTAGCATATG TAATTGTTTG TGTTTACGTA TTTCTGGTGT 3360 

AAGTTCCGTG AAACCAATGT CTATATTCCC ATTTAATACG CTATTTATAA TTGTGTCATG 3420 

TTCTAATAAG CTCGGTATGA CATGTGTATC ATTTTGTAAA TGAAACGTTT GGATAAGTGG 3480

TAGTAACATG TGGGATACGT CACTCTCATC ATAGCCAATG TAGATACTTT TATTTTTAGT 3540 
TTCATTAAAT AATAATTTCC CTTCAGATGT GAGCGTAATA TTGCGTCCTT GCTTTTTAAA

TAAAGACACA TTAAGTTCTT GTTCTAATAA TGTAATTTGA CGGCTTATCG CTGATTGAGC

AATGTTTAGT TCAAGTGCTG TTTCGGAGAT ATGTTCTCTT TTAGCGACCT CGATAAAATA

TCTTAATTGT TTAATTTCCA TAGCGATATA GGCACCTCCA AAAATGAGTG TTTTGTAACT

ATTATAGCAA TATTATTGAT AAATGTTCTA TTTTTTAGAT GAATATCTTC TATTTTATAT

ATTGAACAGA TAAATTTTTT AGATTATAGT AATTATCATT AATAACTAAT ATCAGAATAT

TCTAAAAAAG GGGTGTGCAT CATGCACAAT GAGAAATTAA TTAAAGGCTT ATATGACTAT

CGTGAGGAAC ATGATGCGTG TGGTATTGGT TTTTATGCGA ATATGGATAA TAAAAGGTCT

CACGACATCA TTGATAAATC GCTTGAAATG TTGCGACGCT TAGATCACAG GGGCGGGGTC

GGCGCAGATG GCATCACTGG TGATGGCGCA GGTATTATGA CTGAAATACC TTTTGCATTT

TTCAAACAAC ATGTAACGGA CTTTGATATC CCAGGTGAAG GTGAATATGC CGTGGGGTTA

TTTTTTTCCA AAGAACGCAT TTTAGGTTCT GAACATGAAG TAGTTTTTAA AAAATATTTT

GAAGGCGAAG GGTTATCAAT TCTTGGTTAT CGTAATGTAC CAGTTAATAA AGATGCCATT

GCTAAACATG TAGCAGATAC GATGCCAGTC ATTCAACAAG TGTTTATTGA TATTAGGGAC

ATTGAAGATG TTGAAAAGCG TTTGTTTTTA GCGAGAAAAC AATTAGAGTT CTATTCGACT

CAGTGCGATT TAGAATTGTA TTTTACGAGC TTATCACGCA AAACAATTGT ATATAAAGGT

TGGTTACGAT CAGACCAAAT TAAAAAACTA TATACAGATT TATCGGATGA TTTATATCAA

TCAAAGCTAG GGTTAGTGCA TTCGAGATTT AGTACGAATA CATTCCCGAG TTGGAAAAGG

GCACATCCTA ACCGTATGTT AATGCATAAT GGTGAGATTA ACACGATTAA AGGTAATGTA

AACTGGATGC GAGCACGCCA ACATAAATTA ATCGAAACAT TATTTGGCGA GGATCAACAT

AAAGTGTTTC AAATTGTCGA TGAGGATGGT AGTGACTCTG CCATTGTAGA TAATGCGCTA

GAGTTCTTAT CGTTAGCCAT GGAGCCAGAA AAGGCAGCGA TGTTACTCAT ACCTGAACCT

TGGTTATATA ATGAAGCGAA TGATGCAAAT GTACGTGCGT TTTATGAATT TTATAGTTAT

TTAATGGAAC CGTGGGATGG TCCTACAATG ATTTCGTTCT GTAACGGTGA CAAACTTGGC

GCGCTTACAG ATAGAAATGG ATTACGTCCA GGTCGTTATA CGATTACTAA AGATAACTTT

ATTGTCTTTT CATCTGAAGT GGGTGTTGTG GACGTACCTG AAAGTAATGT TGCTTTTAAA

GGTCAATTGA ATCCTGGAAA GTTATTGCTT GTTGATTTTA AACAGAATAA AGTCATTGAA

AATAATGATT TAAAAGGTGC GATTGCTGGA GAATTACCAT ATAAAGCGTG GATTGATAAC

CA7AAAGTTG ACTTTGATTT TGAAAATATA CAATATCAAG ATTCGCAATG GAAAGATGAG
CAGGAACTTG TAGAAGGTAA GAAGGATCCT ATCGGTGCAA TGGGATATGA TGCGCCAATT 5460

GCAGTGTTGA ACGAGCGACC AGAATCACTA TTTAATTACT TTAAACAGCT GTTTGCACAA 5520 

GTTACGAATC CACCAATTGA TGCGTATCGT GAAAAAATCG TAACGAGTGA ACTTTCTTAT 5580 

TTAGGTGGCG AAGGTAACTT ACTAGCACCT GACGAAACGG TTTTAGATCG TATTCAATTG 5640 

AAAAGGCCGG TATTGAATGA ATCACACTTA GCAGCGATTG ATCAGGAACA TTTTAAATTA 5700 

ACTTATTTAT CAACGGTATA TGAAGGGGAT TTGGAAGATG CGTTAGAAGC ATTAGGCCGA 5760 

GAAGCAGTGA ATGCTGTAAA GCAAGGCGCT CAAATTCTAG TGTTAGATGA TAGTGGATTA 5820 

GTTGATAGCA ATGGCTTTGC AATGCCGATG TTACTCGCAA TAAGTCATGT GCATCAATTA 5880

CTTATTAAAG CAGATTTACG TATGTCTACA AGTTTAGTCG CTAAATCTGG TGAGACACGA 5940

GAAGTGCATC ATGTTGCTTG TTTACTCGCA TATGGCGCGA ATGCAATTGT GCCATACCTA 6000 

GCGCAACGTA CAGTTGAACA ACTGACATTG ACAGAAGGGT TACAAGGCAC CGTTGTCGAT 6060

AATGTTAAGA CATATACGGA TGTATTGTCA GAAGGTGTCA TTAAAGTAAT GGCTAAGATG 6120 

GGAATTTCGA CAGTGCAAAG TTATCAAGGG GCACAAATAT TTGAAGCGAT TGGCTTGTCT 6180 

CATGATGTGA TTGATCGTTA TTTTACTGGG ACACAGTCTA AGTTATCTGG TATTTCGATT 6240

GATCAAATTG ATGCTGAAAA TAAAGCACGT CAACAAAGTG ATGATAATTA TCTTGCATCA 6300

GGTAGTACAT TCCAATGGAG ACAACAAGGT CAACATCATG CTTTTAATCC GGAATCTATT 6360

TTCTTATTGC AGCACGCATG TAAAGAAAAT GACTATGCGC AATTTAAAGC ATACTCTGAA 6420 

GCGGTGAACA AAAATAGAAC AGATCACATT AGACATTTAC TTGAATTTAA AGCATGTACA 6480 

CCGATTGACA TCGACCAAGT TGAACCGGTA AGTGACATTG TCAAACGCTT TAATACAGGG 6540 

GCGATGAGTT ATGGATCGAT TTCAGCGGAA GCACATGAAA CGTTAGCACA AGCCATGAAC 6600

CAA13AGGTG GAAAGAGTAA TAGTGGTGAA GGTGGCGAAG ATGCAAAACG TTATGAAGTA 6660 

CAAGTTGATG GAAGCAACAA AGTAAGTGCG ATTAAACAAG TTGCTTCTGG GCGTTTTGGT 6720 

GTAACTAGTG ATTATTTACA ACATGCCAAA GAAATTCAAA TTAAAGTTGC GCAAGGTGCA 6780 

AAGCCTGGTG AAGGTGGTCA ATTACCTGGT ACTAAGGTAT ATCCGTGGAT TGCGAAGACA 6840 

AGAGGGTCAA CGCCAGGTAT CGGTCTGATT TCACCACCGC CACATCATGA TATTTATTCA 6900

ATAGAAGATT TAGCGCAACT GATACATGAT TTGAAAAATG CGAATAAAGA TGCAGATATC 6960 

GCGGTAAAAT TAGTTTCGAA AACAGGTGTT GGTACCATTG CATCTGGGGT GGCAAAAGCA 7020 

TTTGCAGATA AAATTGTCAT CAGTGGTTAC GATGGTGGTA CAGGGGCTTC ACCCAAAACG 7080

AGTATTCAGC ATGCCGGTGT TCCTTGGGAG ATTGGTTTAG CAGAAACACA TCAAACATTA 7140 
AAAGATGTAG CGTACGCATG TGCGCTTGGA GCGGAAGAAT TTGGATTTGC AACTGCACCA 7260

TTAGTGGTGT TGGGCTGTAT TATGATGCGT GTATGCCATA AAGATACATG TCCAGTAGGA 7320

GTTGCAACTC AAAACAAAGA TTTACGTGCT TTATATAGAG GTAAAGCACA TCATGTTGTT 7380

AATTTTATGC ATTTTATTGC ACAAGAATTA AGAGAAATTT TAGCATCTTT AGGTTTGAAA 7440

CGTGTAGAAG ACTTAGTTGG AAGAACTGAT TTATTACAAC GATCATCAAC ATTAAAAGCG 7500

AATAGCAAAG CGGCTAGTAT TGATGTTGAA AAACTGTTAT GTCCTTTCGA TGGGCCAAAC 7560

ACAAAAGAAA TTCAACAAAA TCATAATCTT GAGCATGGAT TTGATTTAAC AAATTTATAT 7620

GAAGTAACGA AGCCATATAT TGCTGAAGGG CGTCGCTATA CAGGTAGCTT TACAGTAAAT 7680

AATGAACAAC GTGATGTAGG GGTTATTACA GGTAGTGAGA TTTCGAAACA ATATGGAGAA 7740

GCAGGACTTC CTGAAAATAC AATTAATGTT TATACGAATG GTCATGCTGG TCAAAGTCTT 7800

GCAGCATATG CACCGAAAGG CTTAATGATT CATCATACTG GAGATGCGAA TGACTATGTT 7860

GGTAAAGGAT TATCTGGTGG TACGGTCATT GTCAAAGCAC CTTTTGAAGA ACGACAAAAT 7920

GAAATTATTG CTGGTAACGT CTCATTCTAT GGTGCGACAA GTGGTAAGGC ATTTATTAAC 7980

GGTAGTGCAG GAGAAAGATT CTGTATTAGA AATAGTGGTG TAGATGTTGT CGTTGAAGGT 8040

ATCGGCGACC ATGGATTAGA GTATATGACT GGTGGACATG TCATTAATTT AGGTGATGTA 8100

GGTAAGAACT TCGGTCAAGG TATGAGTGGT GGTATTGCTT ACGTTATCCC GTCTGATGTA 8160

GAAGCTTTTG TTGAAAATAA TCAACTAGAT ACGCTTTCGT TTACAAAGAT TAAACACCAA 8220

GAAGAAAAAG CATTCATTAA GCAAATGCTG GAAGAACATG TGTCACACAC GAATAGTACG 8280

AGAGCGATTC ATGTGTTAAA ACATTTTGAT CGCATTGAAG ATGTCGTCGT TAAAGTTATT 8340

CCTAAAGATT ATCAATTAAT GATGCAAAAA ATTCATTTGC ACAAATCATT ACATGACAAT 8400

GAAGATGAAG CGATGTTAGC TGCATTTTAC GATGACAGTA AAACAATCGA TGCTAAACAT 8460

AAACCAGCCG TTGTGTATTA AGGAAAGGGG GAGATACGAT GGGTGAATTT AAAGGATTTA 8520

TGAAGTATGA CAAACAGTAC TTAGGTGAAT TATCACTGGT AGACCGTTTG AAGCATCATA 8580

AAGCATATCA ACAACGATTT ACTAAAGAAG ATGCCTCTAT CCAAGGTGCA CGATGTATGG 8640

ATTGTGGAAC GCCGTTTTGT CAAACCGGAC AACAGTATGG TAGGGAAACA ATAGGTTGTC 8700

CAATTGGAAA CTACATTCCT GAATGGAACG ACTTAGTGTA TCATCAAGAT TTTAAAACTG 8760

CTTATGAACG CTTAAGCGAA ACAAATAACT TTCCTGACTT TACAGGGCGT GTATGTCCTG 8820

CACCATGCGA AAGTGCTTGT GTGATGAAGA TTAATAGAGA ATCGATTGCG ATTAAAGGTA 8880

TTGAACGCAC AATTATTGAT GAAGCTTTTG AAAATGGTTG GGTAGCGCCG AAAGTTCCGA 8940
CTGAAGAACT TAATCTACTA GGATATCAAG TAACTATTTA TGAACGTGCT AGAGAATCAG 9060

GCGGTTTATT AATGTATGGT ATTCCGAATA TGAAACTTGA TAAAGATGTG GTTCGACGTC 9120

GTATTAAGTT AATGGAAGAA GCGGGCATTA CTTTCATTAA TGGTGTTGAA GTCGGTGTTG 9180 

ATATTGATAA AGCAACGTTA GAATCTGAGT ATGATGCCAT TATATTATGT ACTGGTGCAC 9240 

AAAAAGGTAG AGATTTACCT TTAGAAGGAC GCATGGGTGA TGGTATACAT TTCGCTATGG 9300 

ATTATTTAAC TGAACAAACG CAGTTGTTAA ATGGAGAAAT TGATGATATA ACAATAACTG 9360 

CAAAAGATAA GAATGTCATT ATCATTGGTG CTGGTGATAC AGGGGCAGAC TGTGTAGCGA 9420 

CAGCATTAAG AGAAAATTGT AAATCGATTG TTCAATTTAA TAAATATACG AAATTGCCAG 9480 

AAGCAATTAC ATTTACAGAA AATGCATCAT GGCCTTTAGC AATGCCGGTG TTTAAAATGG 9540 

ACTATGCGCA CCAAGAGTAC GAAGCTAAGT TTGGTAAGGA ACCACGTGCA TATGGTGTTC 9600 

AAACAATGCG TTACGATGTT GACGATAAAG GACACATACG TGGTTTGTAT ACTCAAATTT 9660 

TAGAGCAAGG CGAAAATGGT ATGGTCATGA AAGAAGGACC TGAAAGATTT TGGCCTGCTG 9720 

ACCTTGTATT ATTATCAATC GGCTTCGAAG GTACAGAACC AACAGTACCG AATGCTTTTA 9780 

ACATTAAAAC GGATAGAAAT CGAATCGTGG CGGATGATAC AAACTATCAA ACTAATAATG 9840 

AAAAGGTATT TGCTGCTGGA GATGCTAGAC GTGGTCAAAG TTTAGTTGTA TGGGCAATTA 9900 

AAGAAGGTAG AGGCGTAGCG AAAGCAGTAG ATCAGTATTT AGCTAGTAAA GTTTGTGTAT 9960 

AATCTTTGTA TGGAAATGGT GGTTACGTTG ACGTTGTGAC ATGCTGAATC GAGTTTGAAA 10020 

AAATCTAGTA TCTATCAACG TCACATGCCA TCTTTGTAAC CTAAAAACAA AGGTTTGTAA 10080 

GACAACAAAT AGATTAATTA TAAGTAGTGA TTTTTTACAT TCGTTTATAG GTCAACTGTA 10140 

GTGGAAGACA ATGATTTGTG GTAATCATGT AATGCTTAAA AACAATATTG ACTTTTACAG 10200 

AACGTTCATA TATGATAAAT ATTGTGTTTA GGAGGAATAC CCAAGTCCGG CTGAAGGGAT 10260 

CGGTCTTGAA AACCGACAGG GGCTTAACGG CTCGCGGGGG TTCGAATCCC TCTTCCTCCG 10320 

CCATCAATAT TTATATTAAA TTCTATATAT AATGAAGGTA AGTGCTCAAA TTTTGAGTAT 10380 

TTACCTTTTT TATTTGTCTT TGAATGGCTC GTAATTTTTG ATAATAGAAA TGATAAGGCA 10440 

TTGAGATTGG AAGGGCATTT GGCTTGTGCA ATATACATAG CTAAATGTCT TTTTTGTTTT 10500 

GTGAAATATG ATGGATGGCT TGTGTGGACA AGTTTGCTAT TTATAGATAT GCATTTTTCA 10560

ATTTAGGAGT TGGCCATGCA TCTACACTTT ATAATGGTGA GAGCGTGGTG AGGTATTGTT 10620 

AATAACGCAA TTGTAGCGAG GAGTTATTGC TACATATGTC GTTATGGCTC ATTGATTTTC 10680 

TGAAATGGCT ACCCCAGATA ATTGTGACAA AATAAAAATA TTTTGTTGAA AGCCTTTACA 10740 
TAAAAAGAGA AGATGTAAAA GCCATCGTAA CCGCTATTGG GGGAAAAGAA AATCTTGAAG 10860

CTGCAACGCA TTGTGTAACA CGATTACGTT TAGTGCTGAA GGATGAAAGT AAAGTTGATA 10920 
AAGACGCATT AAGTAATAAC GCGTTGGTCA AGGGGCAGTT TAAAGCAGAC CATCAATATC 10980

AAATTGTCAT TGGTCCAGGA ACAGTCGATG AAGTGTATAA GCAGTTTATT GATGAAACAG 11040 

GTGCTCAAGA AGCTTCGAAA GATGAAGCGA AACAAGCAGC TGCACAAAAA GGGAATCCAG 11100 

TACAACGTTT GATCAAATTG TtGGGGGATA TTTTTATACC AATATTACCT GCGATTGTGA 11160 

CAGCTGGTTT GTTAATGGGA ATCAATAATT TACTTACAAT GAAAGGTTTA TTTGGTCCAA 11220 

AAGCACTTAT TGAGATGTAT CCACAAATTG CTGATATTTC AAACATCATT AATGTGATTG 11280 

CGAGTACGGC ATTTATTTTC TTACCAGCAT TAATTGGTTG GAGTAGTATG CGTGTATTTG 11340 

GTGGTAGTCC GATTCTAGGC ATAGTCTTAG GTTTGATTTT AATGCATCCG CAATTAGTAT 11400

CTCAGTATGA TTTGGCAAAA GGGAATATTC CGACGTGGAA CTTATTTGGC TTAGAGATTA 11460 

AGCAGTTGAA TTACCAAGGT CAAGTGTTGC CAGTtTTAAT TGCAGCTTAC GTTCTAGCTA 11520 

AAATTGAAAA AGGATTAAAT AAAGTCGTTC ACGATTCGAT AAAAATGTTG GTCGTTGGAC 11580

CCGTAGCGCT TTTAGTTACT GGATTTTTAG CATTTATTAT CATTGGACCA GTTGCGTTAT 11640

TGaTTGGTAC AGGTATTACA TCTGGTGTTA CATTTATATT CCAACATGCA GGATGGCTTG 11700 

GCGGAGCAAT ATATGGATTG TTATATGCAC CACTTGTAAT TACAGGACTA CACCATATGT 11760

TTTTAGCAGT AGATTTCCAA TTGATGGGTA GCAGCTTAGG CGGTACGTAT TTATGGCCAA 11820 

TTGTTGCGAT TTCCAATATT TGTCAGGGCT CTGCAGCATT TGGAGCATGG TTTGTCTATA 11880

AACGTCGTAA AATGGTTAAA GAAGAAGGCT TGGCATTAAC ATCTTGTATT TCTGGTATGT 11940 

TAGGTGTTAC TGAACCAGCC ATGTTCGGTG TGAACTTACC TCTGAAATAT CCATTTATCG 12000 

CTGQGATATC AACGTCTTGT GTATTGGGGG CAATCGTTGG TATGAATAAC GTACTTGGAA 12060 

AAGTTGGTGT TGGTGGCGTG CCAGCATTCA TTTCAATTCA AAAAGAATTT TGGCCAGTAT 12120 

ATCTTATTGT GACAGCTATT GCTATTGTTG TACCATGTAT ACTAACAATT GTGATGTCTC 12180 

ATTTTAGTAA ACAAAAAGCG AAAGAAATTG TTGAAGATTA ATAAAATAAA AAAGGGGCGT 12240 

TCGTTATTTG GACGTCCTTT ATTACGTTAT AAGGTGGTAA TTGTGTGTCG AAAGAAATAG 12300 

ATTGGAGAAA ATCCGTTGTA TATCAAATTT ATCCTAAGTC GTTTAATGAT ACGACGGGGA 12360 

ATGGTATAGG AGATATCAAT GGAATTATAG AAAAATTGGA TTATATCAAG TTATTGGGTG 12420 

TTGATTATAT TTGGTTAACA CCAGTGTATG AATCACCGAT GAATGATAAT GGCTATGATA 12480 

TCAGCAATTA TTTAGAAATC aATGAAGACT TTGGAACGAT GGATGATTTT GaAAAGTTAA 12540 
CGACGGAGCA TGaATGGTTT AAAGAAGCCC GTAAATCTAA AGATAACCCy TATAGAGATT 12660

ATTACTTTTT CAGATCATCT GAAGACGGGC CGCCAACAAA TTGGCATTCT AAATTCGGTG 12720 

GTAATGCATG GAAGTATGAT TCTGAGACAG ATGAATATTA TTTACATTTA TTTGATGTCA 12780 

GTCAAGCTGA TTTAAATTGG GATAATCCGG AAGTACGTCA ATCGTTATAT CGCATAGTCA 12840 

ATCATTGGAT AGACTTCGGC GTTGATGGTT TTCGATTTGA TGTCATTAAC TTAATTTCTA 12900 

AAGGTGAATT TAAGGACTCT GACAAAATAG GTAAAGAATT TTATACGGAT GGTCCTAGAG 12960 

TGCATGAGTT TCTGCATGAA TTAAATCGTC AAACGTTTGG TAACACTGAC ATGATGACTA 13020 

TAGGAGAAAT GTCTTCGACG ACGATTGAAA ATTGTATTAA GTATACACAA CCAGAACGCC 13080 

AAGAATTGAA TAGTGTTTTT AATTTTCATC ATCTAAAGGT TGATTATGTT GATGGTGAAA 13140 

AGTGGACAAA TGCGAgcTTG nATTTTCATA AGTTAAAGGA AATTCTGATG CAATGGCAAC 13200

GAGGTATTTA TGACGGTGGC GGATGGAACG CGATTTTCTG GTGTAATCAT GATCAGCCAC 13260 

GGGTAGTGTC TAGATTTGGT GATGATACGT CGGAAGAGAT GAGGATACAA AGTGCTAAAA 13320 

TGTTAGCTAT CGCACTGCAT ATGTTGCAAG GGACGCCATA TATTTACCAA GGTGAAGAAA 13380

TTGGTATGAC GGACCCACAT TTTACATCAA TAGCACAATA TCGTGATGTT GAATCGATTA 13440 

ATGCCTACCA TCAGTTGTTA AGTGAAGGGC ATGCTGAAGC GGATGTGTTA GCGATTTTAG 13500

GACAGAAGTC ACGAGACAAT TCGAGAACGC CTATGCAATG GAGTGATGAT GTTAATGCTG 13560 

GATTTACAGC TGGTAAnCCT TGGATTGATA TTTCGGAAAA TTATCATCAG GTCAACGTTA 13620 

GACAAGCACT TCAGAATAAA GAGTCTATTT TCTATACGTA TCAAAAATTA ATACAATTAA 13680 

GACATACGCA TGATATTATT ACGTATGGAG ACATTGTGCC ACGTTTTATG GATCATGATC 13740 

ATTTATTTGT TTATGAACGT CATTATAAGA ATCAACAATG GCTAGTAATT GCGAATTTCT 13800 

CAGCATCGGC TGTTGATTTG CCAGAAGGAT TGGCTAGAGA AGGTTGTGTT GTGATTCAAA 13860 

CAGGCACAGT GGAAAATAAT ACGATAAGCG GGTTTGGTGC AATTGTAATC GAAACAAACG 13920 

CGTAAAATAA ATTGAGTGGA TGCGTTTATA TGGCGAAACA AAAAAAGTTT ATGAAGATTT 13980 

ATGAGGCGTT GAAAGAAGAT ATATTAAACG GGCAGATTCA ATATGGTGAA CAAATTCCGT 14040 

CTGAACATGA TTTGGTGCAA TTGTACCAGT CATCTCGAGA GACCGTGCGT AAGGCATTAG 14100 

ATTTGTTGGC ATTAGACGGC ATGATTCAAA AGATTCATGG TAAAGGGTCA CTTGTCATTT 14160 

ATCAGGAGGT TACAGAGTTT CGATTTTCTG AACTTGTTAG TTTTAAAGAA ATGCAAGAAG 14220 

AAATGGGCGT CGCATATTTA ACTGAAGTTG TTGTGAATGA GGTTGTTGAA GCGCATGAAG 14280

TTCCAGAAGT TCAACATGCT TTAAACATCA ATTCTAGTGA ATCACTCATT CATATTGTTA 14340 
AATAATGGCG TTGATGAAAT AAGAAATATT AGAGACAAAG TTAAATATGC ACCAAGTGAA 16260

TCGAAATATA AAGTTTATAT TATAGATGAG GTGCACATGC TAACAACAGG TGCTTTTAAT 16320

GCCCTTTTAA AGACGTTAGA AGAACCTCCA GCACACGCTA TTTTTATATT GGCAACGACA 16380

GAACCACATA AAATCCCTCC AACAATCATT TCTAGGGCAC AACGTTTTGA TTTTAAAGCA 16440

ATTAGCCTAG ATCAAATTGT TGAACGTTTA AAATTTGTAG CAGATGCACA ACAAATTGAA 16500

TGTGAAGATG AAGCCTTGGC ATTTAtcgCT AAAGCGTCTG AAGGGGGTAT GCGTGATGCA 16560

TTAAGTATTA TGGATCAGGC TATTGCATTT GGTGATGGTA CGTTAACATT GCAAGATGCG 16620

TTGAATGTCA CAGGTAGCGT ACATGATGAA GCGTTGGATC ACTTGTTTGA TGATATTGTA 16680

CAAGGTGACG TACAAGCATC TTTTAAAAAA TACCATCAGT TTATAACAGA AGGTAAAGAA 16740

GTGAATCGCC TAATAAATGa TATGATTTAT TTTGTCaGAG ATACGATTAT GAATAAAACA 16800

TCTGAGAAAG ATACTGAGTA TCGAGCACTG ATGAACTTAG AATTAGATAT GTTATATCAA 16860

ATGATTGATC TTATTAATGA TACATTAGTG TCGATTCGTT TTAGTGTGAA TCAAAACGTT 16920

CATTTTGAAG TGTTGTTAGT AAAATTAGCT GAGCAGATTA AGGGTCAACC ACAAGTGATT 16980

GCGAATGTAG CTGAACCAGC ACAAATTGCT TCATCGCCAA ACACAGATGT ATTGTXGCAA 17040

CGTATGGAAC AGTTAGAGCA AGAACTAAAA ACACTAAAAG CACAAGGAGT GAGTGTCGCT 17100

CCTGTTCAAA AATCTTCGAA AAAGCCTGCG AGAGGCATAC AAAAATCTAA AAATGCATTT 17160

TCAATGCAAC AAATTGCAAA AGTGCTAGAT AAAGCGAATA AGGCAGATAT CAAA7TGTTG 17220

AAAGATCATT GGCAAGAAGT GATTGATCAT GCCAAAAATA ATGATAAAAA ATCACTCGTT 17280

AGTTTATTGC AAAATTCGGA ACCTGTGGCG GCAAGTGAAG ATCACGTACT TGTGAAATTT 17340

GAGGAAGAGA TCCATTGTGA AATCGTCAAT AAAGACGACG AGAAACGTAG TAGTATAGAA 17400

AGTGTTGTAT GTAATATCGT TAATAAAAAC GTTAAAGTTG TTGGTGTACC ATCAGATCAA 17460

TGGCAAAGAG TTCGAACGGA ATATTTACAA AATCGTAAAA ACGAAGGCGA TGATATGCCA 17520

AAGCAACAAG CACAACAAAC AGATATTGCT CAAAAAGCAA AAGATCTTTT CGGTGAAGAA 17580

ACTGTACATG TGATAGATGA AGAGTGATAC ATGACAAGCG ATATAATCGT ATGTATAATG 17640

AAAGAAACAT CATTTTATTG ATAAATATTT ATTGATTTTC AAGGAGGAAA TGGAATATGC 17700

GCGGTGGCGG AAACATGCAA CAAATGATGA AACAAATGCA AAAAATGCAA AAGAAAATGG 17760

CTCAAGAACA AGAAAAACTT AAAGAAGAGC GTATTGTAGG AACAGCTGGC GGTGGCATGG 17820

TTGCAGTTAC TGTAACTGGT CATAAAGAAG TTGTCGACGT TGAAATCAAA GAAGAAGCTG 17880

TAGACCCAGA CGATATTGAA ATGCTACAAG ACTTAGTGTT AGCAGCTACT AATGAAGCGA 17940
TCCCTGOaAT GTGATCATAG ATGCATTATC CAGAACCTAT ATCAAAACTT ATTGATAGCT 18060

TTATGAAATT GCCAGGCATT GGTCCAAAGA CAGCCCAACG TCTGGCTTTT CATACCTTAG 18120 

ATATGAAAGA AGACGATGTT GTTCAGTTTG CCAAAGCATT AGTAGATGTT AAGAGAGAAT 18180

TAACATATTG TAGCGTATGT GGTCACATTA CTGAAAATGA TCCATGTTAT ATTTGTGAAG 18240

ATAAGCAAAG AGATCGTTCA GTTATTTGTG TTGTGGAAGA TGACAAAGAT GTCATAGCTA 18300 

TGGAAAAAAT GAGAGAATAC AAAGGTTTAT ATCACGTTTT ACATGGGTCT ATTTCGCCTA 18360 

TGGATGGCAT TGGACCAGAA GATATTAATA TTCCTTCATT GATTGAACGC TTGAAAAACG 18420 

ATGAAGTTAG CGAATTAATC TTAGCTATGA ACCCGAACTT AGAGGGGGAA TCTACAGCCA 18480 

TGTATATTTC TAGATTAGTT AAGCCTATAG GTATCAAAGT GACGAGATTA GCACAAGGGT 18540 

TATCGGTAGG TGGCGATTTA GAGTATGCTG ACGAAGTAAC ATTATCTAAA GCAATCGCAG 18600 

GTAGAACAGA AATGTAATkT CTTCTATTAA ACATTTTTGA TTTTAATACT ATAGTAAGAA 18660 

AAGTCACAGT GTAATCATTG TGGCTTTTTT TATGGTGTGG TGTGATGTAC TACTTTATTT 18720 

GCGGTGTGGC GGTGGTATGG TTTACCTAGT TTTACTGAGG GATGGGTAAT CTTTAGGAAG 18780 

CAAGCCGTTG GTTGTGATTT GTTACTTCTA ATAGTAATGA TGTGAATTGG ATTATCGAAT 18840 

TAGATCTATG GTTATGGTGT GTTGGTGCTA TTAATTTGAT AAATGCGGTT AATGACTATG 18900 

CAAATGAAAT TCTTTTGTAA TTGAAATGAT AGATGCTGGC TTAGTAAGTT GTACTTCTTT 18960

GGTCTAAAGC TTATTAAATC AGCCTGTATA GCGGTGTTTT GAGAGATTAT TTAAAACTTG 19020 

TAAATTTATT TTTAATTTCT GGTAAAAAAA TAACGTTCTG TTTTGCGTTT TTTTTGATTG 19080

ATATGGTTAG AGAAAAATCT GTT7CTTGTT CTAAAAAACG TACTATTTAT AAGTGGGGAT 19140

TTTTTAAGTT CGATTTTTAG GATAAGGGCG TTCAGTACAG ATGACAAAGG TGTAATTTTT 19200 

ACTGTTGTTA AGCAGTTTGA AAGCCTGTAT AGTATTTATT TGTTGAGGCA AACAAAACAA 19260 

CTCAACTTAA GAAATAACTT GAATTACTAA CGAAAATTAA TTTTAAAAAG TTATTGACTT 19320 

AAATGTTAAT AAAATGTATA ATTAATTCTT GTCGGTAAGA AAAATGAACA TTGAAAACTG 19380

AATGACAATA TGTCAACGTT AATTCCAAAA AACGTAACTA TAAGTTACAA ACATTATTTA 19440 

GTATTTATGA GCTAATCAAA CATCATAATT TTTATGGAGA GTTTGATCCT GGCTCAGGAT 19500

GAACGCTGGC GGCGTGCCTA ATACATGCAA GTCGAGCGAA CGGACGAGAA GCTTGCTTCT 19560 

CTGATGTTAG CGGCGGACGG GTGAGTAACA CGTGGATAAC CTACCTATAA GACTGGGATA 19620 

ACTTCGGGAA ACCGkAGCTA ATACCGGATA ATATTTTGAA CCGCATGGTT CAAAAGTGAA 19680 

AGACGGTCTT GCTGTCACTT ATAGATGGAT CCGCGCTGCA TTAGCTAGTT GGTAAGGTAA 19740 
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GAGACACGGT CCAGACTCCT ACGGGAGGCA GCAGTAGGGA ATCTTCCGCA ATGGGCGAAA 19860 

gCtGaCGGAG CAACGCCGCG TGAGTGATGA AGGTCTTCGG ATCGTAAAAC TCTGTTATTA 19920 

GGGAAGAACA TATGTGTAAG TAACTGTGCA CATCTTGACG GTACCTAATC AGAAAGCCAC 19980

GGCTAACTAC GTGCCAGCAG CCGCGGTAAT ACGTAGGTGG CAAGCGTTAT CCGGAATTAT 20040 

TGGGCGTAAA GCGCGCGTAG GCGGTTTTTT AAGTCTGATG TGAAAGCCCA CGGCTCAACC 20100 

GTGGAGGGTC ATTGGAAACT GGAAAACTTG AGTGCAGAAG AGGAAAGTGG AATTCCATGT 20160

GTAGCGGTGA AATGCGCAGA GATATGGAGG AACACCAGTG GCGAAGGCGA CTTTCTGGTC 20220 

TGTAACTGAC GCTGATGTGC GAAAgCGTGG GGATCAAACA GGATTAGATA CCCTGGTAGT 20280 

CCACGCCGTA AACGATGAGT GCTAAGTGTT AGGGGGTTTC CGCCCCTTAG TGCTGCAGCT 20340 

AACGCATTAA GCACTCCGCC TGGGGAGTAC GACCGCAAGt TGAAACTCAA AGGAATTGAC 20400 

GGGGACCCGC ACAAGCGGTG GAGCATGTGG TTTAATTCGA AGCAACGCGA AGAACCTTAC 20460 

CAAATCTTGA CATCCTTTGA CAACTCTAGA GATAGAGCCT TCCCCTTCGG GGGACAAAGT 20520 

GACAGGTGGT GCATGGTTGT CGTCAGCTCG TGTCGTGAGA TGTTGGGTTA AGTCCCGCAA 20580 

CGAGCGCAAC CCTTAAGCTT AGTTGCCATC ATTAAGTTGG GCACTCTAAG TTGACTGCCG 20640 

GTGACAAACC GGAGGAAGGT GGGGATGACG TCAAATCATC ATGCCCCTTA TGATTTGGGC 20700 

TACACACGTG CTACAATGGA CAATACAAAG GGCAGCGAAA CCGCGAGGTC AAGCAAATCC 20760 

CATAAAGTTG TTCTCAGTTC GGATTGTAGT CTGCAACTCG ACTACATGAA GCTGGAATCG 20820 

CTAGTAATCG TAGATCAGCA TGCTACGGTG AATACGTTCC CGGGTCTTGT ACACACCGCC 20880 

CGTCACACCA CGAGAGTTTG TAACACCCGA AGCCGGTGGA GTAACCTTTT AGGAGCTAGC 20940 

CGTCGAAGGT GGGACAAATG ATTGGGGTGA AGTCGTAACA AGGTAGCCGT ATCGGAAGGT 21000

GCGQCTGGAT CACCTCCTTT CTAAGGATAT ATTCGGAACA TCTTCTTCAG AAGATGCGGA 21060 

ATAACGTGAC ATATTGTATT CAGTTTTGAA TGTTTATTTA ACATTCAAAT ATTTTTTGGT 21120 

TAAAGTGATA TTGCTTATGA AAATAAAGCA GTATGCGAGC GCTTGACTAA AAAGAAATTG 21180

TACATTGAAA ACTAGATAAG TAAGTAAAAT ATAGATTTTA CCAAGCAAAA CCGAGTGAAT 21240 

AAAGAGTTTT AAATAAGCTT GAATTCATAA GAAATAATCG CTAGTGTTCG AAAGAACACT 21300 

CACAAGATTA ATAACGCGTT TAAATCTTTT TATAAAAGAA CGTAACTTCA TGTTAACGTT 21360

TGACTTATAA AAATGGTGGA AACATAGATT AAGTTATTAA GGGCGCACGG TGGATGCCTT 21420 

GGCACTAGAA GCCGATGAAG GACGTTACTA ACGACGATAT GCTTTGGGGA GCTGTAAGTA 21480 

AGCTTTGATC CAGAGATTTC CGAATGGGGA AACCCAGCAT GAGTTATGTC ATGTTATCGA 21540 
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GAGGAAGAGA AAGAAAATTC GATTCCCTTA GTAGCGGCGA GCGAAACGGG AAGAGCCCAA 21660

ACCAACAAGC TTGCTTGTTG GGGTTGTAGG ACACTCTATA CGGAGTTACA AAGGACGACA 21720 

TTAGACGAAT CATCTGGAAA GATGAATCAA AGAAGGTAAT AATCCTGTAG TCGAAAATGT 21780

TGTCTCTCTT GAGTGGATCC TGAGTACGAC GGAGCACGTG AAATTCCGTC GGAATCTGGG 21840

AGGACCATCT CCTAAGGCTA AATACTCTCT AGTGACCGAT AGTGAACCAG TACCGTGAGG 21900 

GAAAGGTGAA AAGCACCCCG GAAGGGGAGT GAAATAGAAC CTGAAACCGT GTGCTTACAA 21960

GTAGTCAGAG CCCGTTAATG GGTGATGGCG TGCCTTTTGT AGAATGAACC GGCGAGTTAC 22020 

GATTTGATGC AAGGTTAAGC AGTAAATGTG GAGCCGTAGC GAAAGCGAGT CTGAATAGGG 22080

CGTTTAGTAT TTGGTCGTAG ACCCGAAACC AGGTGATCTA CCCTTGGTCA GGTTGAAGTT 22140

CAGGTAACAC TGAATGGAGG ACCGAACCGA CTTACGTTGA AAAGTGAGCG GATGAACTGA 22200 

GGGTAGCGGA GAAATTCCAA TCGAACCTGG AGATAGCTGG TTCTCTCCGA AATAGCTTTA 22260 

GGGCTAGCCT CAAGTGATGA TTATTGGAGG TAGAGCACTG TTTGGACGAG GGGCCCCTCT 22320 

CGGGTTACCG AATTCAGACA AACTCCGAAT GCCAATTAAT TTAACTTGGG AGTCAGAACA 22380

TGGGTGATAA GGTCCGTGTT CGAAAGGGAA ACAGCCCAGA CCACCAGCTA AGGTCCCAAA 22440 

ATATATGTTA AGTGGAAAAG GATGTGGCGT TGCCCAGACA ACTAGGATGT TGGCTTAGAA 22500 

GCAGCCATCA TTTAAAGAGT GCGTAATAGC TCACTAGTCG AGTGACACTG CGCCGAAAAT 22560

GTACCGGGGC TAAACATATT ACCGAAGCTG TGGATTGTCC TTTGGaCAAT GGtAGGAGAG 22620 

CGTTCTAAGG GCGTTGAAGC ATGATCGTAA GGACATGTGG AGCGCTTAGA AGTGAGAATG 22680

CCGGTGTGAG TAGCGAAAGA CGGGTGAGAA TCCCGTCCAC CGATTGACTA AGGTTTCCAG 22740 

AGGAAGGCTC GTCCGCTCTG GGTTAGTCGG GTCCTAAGCT GAGGCCGACA GcGTAGGCGA 22800

TGGA7AACAG GTTGATATTC CTGTACCACC TATAATCGTT TTAATCGATG GGGGGACGCA 22860

tAGGATAGGC GAAgcGTGcG ATTGGATTGC ACGTCTAAGC AGTAAGGCTG AGTATTAGGC 22920 

AAATCCGGTA CTCGTTAAGG CTGAGCTGTG ATGGGGAGAA GACATTGTGT CTTCGAGTCG 22980 

TTGATTTCAC ACTGCCGAGA AAAGCCTCTA GATAGAAAAT AGGTGCCCGT ACCGCAAACC 23040

GACACAGGTA GTCAAGATGA GAATTCTAAG GTGAGCGAGC GAACTCTCGT TAAGGAACTC 23100 

GGCAAAATGA CCCCGTAACT TCGGGAGAAG GGGTGCTCTT TAGGGTTAAC GCCCAGAAGA 23160 

GCCGCAGTGA ATAGGCCCAA GCGACTGTTT ATCAAAAACA CAGGTCTCTG CTAAACCGTA 23220 

AGGTGATGTA TagGGcTGAC GCCTGCCCGG TGCTGGAAGG TTAAGAGGAG TGGTTAGcTT 23280

CTGCGAAgCT ACGAATCGAA GCCCCAGTAA ACGGCGGCCG TAACTATAAC GGTCCTAAGG 23340 
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TGTCTCAACG AGAGACTCGG TGAAATCATA GTACCTGTGA AGATGCAGGT TACCCGCGAC 23460

AGGACGGAAA GACCCCGTGG AGCTTTACTG TAGCCTGATA TTGAAATTCG GCACAGCTTG 23520

TACAGGATAG GTAGGAGCCT TTGAAACGTG AGCGCTAGCT TACGTGGAGG CGCTGGTGGG 23580

ATACTACCCT AGCTGTGTTG GCTTTCTAAC CCGCACCACT TATCGTGGTG GGAGACAGTG 23640

TCAGGCGGGC AGTTTGACTG GGGCGGTCGC CTCCTAAAAG GTAACGGAGG CGCTCAAAGG 23700

TTCCCTCAGA ATGGTTGGAA ATCATTCATA GAGTGTAAAG GCATAAGGGA GCTTGACTGC 23760

GAGACCTACA AGTCGAGCAG GGTCGAAAGA CGGACTTAGT GATCCGGTGG TTCCGCATGG 23820

AAGGGCCATC GCTCAACGGA TAAAAGCTAC CCCGGGGATA ACAGGCTTAT CTCCCCCAAG 23880

AGTTCACATC GACGGGGAGG TTTGGCACCT CGATGTCGGC TCATCGCATC CTGGGGCTGT 23940

AGTCGGTCCC AAGGGTTGGg CTGTTCGCCC ATTAAAGCGG TACGCGAGCT GGGTTCAGAA 24000

CGTCGTGAGA CAGTTCGGTC CCTATCCGTC GTGGGCGTAG GAAATTTGAG AGGAGCTGTC 24060

CTTAGTACGA GAGGACCGGG ATGGACATAC CTCTGGTGTA CCAGTTGTCG TGCCAACGGC 24120

ATAGCTGGGT AGCTATGTGT GGACGGGATA AGTGCTGAAA GCATCTAAGC ATGAAGCCCC 24180

CCTCAAGATG AGATTTCCCA ACTTCGGTTA TAAGATCCCT CAAAGATGAT GAGGTTAATA 24240

GGTTCGAGGT GGAAGCATGG TGACATGTGG AGCTGACGAA TACTAATCGA TCGAAGACTT 24300

AATCAAAATA AATGTTTTGC GAAGCAAAAT CACTTTTACT TACTATCTAG TTTTGAATGT 24360

ATAAATTACA TTCATATGTC TGGTGACTAT AGCAAGGAGG TCACACCTGT TCCCATGCCG 24420

AACACAGAAG TTAAGCTCCT TAGCGTCGAT GGTAGTcGAA CTTACGTTCC GCTAGAGTAG 24480

AACGTTGCCA GGCAAAAAAT GGATGCGATG AGCCGCATTG AGACCGCAAG GTCTCTTTTT 24540

TTTATGTCTA AAACGTCAAA ATAAAAAGCA AACACAAAGA AAAATGGCTT GGCGAAGTGA 24600

AAACDTTTGA ATCTGACGAA ACGAGAAAAG ArCGCAACGA GTTTAGTAGA GCTAAATGAG 24660

TAAGyGAGAG CCGAAGrAGA GGAAAGAAGC AAGCGATTGT CACAAGTCAA GAAAGGTTCT 24720

TAGCGAsGAT GGTAGCCAAC TTACGTTCCG CTAGAGTAGA ACTGGAAATG ATAATTTAAT 24780

AATGTACACT TTCGATTGTC TAAGTATGTA CAACTTTAAT TTTGTGTTTA TATAAATTTA 24840

AAATGATATC ATCGAAAACA AAATATTGTA TAAATAGAGA AGAGCAGTAA GACGGTATCT 24900

AATTGAAAAT GATCTTACTG CTCTTTTATA TACTTTATTG AAATACAAAA AGGAAATTAA 24960

TTATTATACA ATAGACAAGC TATTGCATAA GTAACACTAA CTTTTATCAA AGAAGTGTTA 25020

CTTTATAATT AATGAT7TTA TTAGAGCGTC TACATGCGGT TTTAAAGCAT CATCGTCTAT 25080

ACCGCCAAAG CCTAATATAA ATTTAGGGGT TTTCTTATAG TCTTGATCAT CATCAAAATT 25140
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TCCATTTTTT ACTGTAATTG TAAAATGCAT ACCCGTTTCA GCACCTTGAA TATCAAGCTG 25260 

CTCTTTGTAA GGTTTCAATC TTTTTAAAAT ATAGGTTAGT TTTCTACGAT AAATTCGTCT 25320 

CATTTTATTT AAATGCCTTT CAAAACCACC GGAAGATATA AACGTTGCAA TAAGGTTTTG 25380

CATATGAACA GGTACAGTGT TGCCTTCAAT GTGATTTTGA GAATGATATT TTTTCATTAT 25440 

AGAATAGGGT AACACCATAT ATGCAACTCG ACAGCTAGGA AAAATAGACT TTGAAAATGT 25500 

ACTGATATAA ATCACTTTTT CTCCTCTTGA ATATAGACCT TGAATTGCTG GAATGGGTTT 25560 

GCCGAAATAT CTAAACTCGG AATCATAATC ATCTTCTATA ATAAATCGTT CTTCTTTTTC 25620 

TTGAGCCCAT TGTATTAATT GAGTTCGTTT TTTTAAGTCC ATCACATATC CAGTTGGAAA 25680

TTGATGGGAA GGCGTTATAT ATACTATATT TTTTTGTGAT TTAATAACTT CATCTACGTT 25740 

TATTCCATTA TCTTCAACTT CAATTTGTTC ATATTCAACT TGTTTTTTAT CTAAAATATT 25800 

TTTGATTGGT GGATAACTAG GTTTTTCGAT AATAAATGTT GAAGTATAAA GTAAATCGAC 25860

TAATTGATTT ACTAATTGTT CGGTAGATGA GCCAATTATA ATTTGATTAG GATCACAAAT 25920 

TACGCCACGA TTAGTAAATA AATAAAATGC CAGTTGAAAC CGCAAATGTA ATTCTCCTTG 25980

AAAATGTCCT CTACGTAATT GATTTAAATG ATTTGTATCA TAAAGATCTT TGGAATACTT 26040 

TCTGAAAAGT TCTATAGGGA AATGTTTCGT ATCTATTTCA TCCAAATTAA AAGCATAATC 26100

ATAAGCTTCA TCACTCGCTT TTGGTTTATA TGAATCATCA TCAAAAAGAG AGGGGATAGG 26160 

TTGATTGTTT AAAATTGTTA AAGATTCAAT TTCGGACACA AAATATCCAG AGCGAGGTCT 26220 

TGAATAAATG TAACCTTCGT CTAATAGAAG TTGATATGCA TGCTCTACGG TTGTTTGGCT 26280 

AATAGATAAA TGTTTGCTTA ATTGTCTTTT AGAATAAAAT TTATCGCCTT CTTTAAATTG 26340 

ACCTTCAATT ATTTGTTTTT TTAATTTTTC ATAAAGTTGA TGGTATAAAG TGTTTTTCAA 26400 

TTTT&TAACT GACCTCCTAA ATTTATCTTA TTTTGTACCT TTTTAAATAT CAGTTTATAC 26460

ATTACAATGT ATTTAATCAA CTTGAAAAGG GGTTTTATGT ATAATGAGTA AAATTATTGG 26520 

ATCAGACAGA GTCAAAAGAG GTATGGCTGA AATGCAAAAA GGCGGCGTTA TTATGGATGT 26580 

CGTTAATGCT GAGCAAGCAA GAATTGCAGA AGAAGCTGGC GCGGTAgCAG TTATGGCATT 26640 

AGAACGAGTA CCTTCTGATA TTAGAGCTGC TGGTGGTGTT GCACGTATGG CAAACCCTAA 26700 

AATTGTAGAA GAAGTAATGA ATGCTGTTTC TATTCCAGTC ATGGCTAAAG CACGTATTGG 26760 

TCATATCACT GAAGCAAGAG TATTAGAGGC GATGGGTGTT GACTATATTG ATGAATCAGA 26820 

AGTGTTAACA CCAGCAGATG AGGAATATCA CTTAAGAAAA GATCAATTTA CAGTACCATT 26880 

TGTATGTGGA TGTCGTAATT TAGGTGAAgm TGCGCGTAGA ATTGGTGAAG GTGCTGCTAT 26940 
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ACAAGTTAAT TCAGAAGTTA GTCGATTGAC TGTAATGAAT GATGATGAGA TTATGACTTT 27060

TGCGAAAGAT ATCGGTGCGC CTTATGAAAT TTTAAAACAA ATTAAAGACA ATGGTCGTTT 27120 

ACCGGTAGTT AACTTTGCAG CTGGTGGCGT TGCGACTCCT CAAGATGCTG CTTTAATGAT 27180 

GGAATTAGGT GCTGACGGTG TATTCGTTGG ATCAGGTATT TTTAAATCAG AAGATCCAGA 27240 

AAAATTTGCT AAAGCAATTG TTCAAGCAAC AACACATTAC CAAGACTATG AACTAATTGG 27300 

AAGATTAGCA AGTGAACTTG GCACTGCTAT GAAAGGTTTA GATATCAATC AATTATCATT 27360 

AGAAGAACGT ATGCAAGAGC GTGGTTGGTA AGATATGAAA ATAGGTGTAT TAGCATTACA 27420 

AGGTGCAGTA CGTGAACATA TTAGACATAT TGAATTAAGT GGTCATGAAG GTATTGCAGT 27480 

TAAAAAAGTT GAACAATTAG AAGAAATCGA GGGCTTAATA TTACCTGGTG GCGAGTCTAC 27540 

AACGTTACGT CGATTAATGA ATTTATATGG ATTTAAAGAG GCTTTACAAA ATTCAACTTT 27600 

ACCTATGTTT GGTACATGCG CAGGATTAAT AGTTCTAGCG CAAGATATAG TTGGTGAAGA 27660 

AGGATACCTT AACAAGTTGA ATATTACTGT ACAACGAAAC TCATTCGGTA GACAAGTTGA 27720 

CAGCTTTGAA ACAGAATTAG ATATTAAAGG TATCGCTACA GATATTGAAG GTGTCTTTAT 27780

AAGAGCCCCA CATATTGAAA AAGTAGGTCA AGGCGTAGAT ATCCTATGTA AGGTTAATGA 27840 

GAAAATTGTA GCTGTTCAGC AAGGTAAATA TTTAGGCGTA TCATTCCATC CTGAATTAAC 27900 

AGATGACTAT AGAGTAACTG ATTACTTTAT TAATCATATT GTAAAaAAAG CATAGCTTAA 27960 

TGTATGCTAA ATCAACGAAT TATTGATATT TATAGATTTG TTGAGAAGAA AATATCTCCT 28020

TCAAACTTAG CTTTGGAGGA GTTATTTTTT ATGTCAAAAT TAAAAATGAT AAAAAATAAA 28080

GCTATACATA AGAAAAAAAC CCTTCAAAGA GACTGAGAAT AGTCAAAATT TTGAAGGGGT 28140 

TAATTCGATG TTGATGTATT TGTTAAATAA AGAATCcAGC GATTGCAGCT GAAATGAAAG 28200 

ATACTAGTGT tGCACCGAAT AATAATTTCA AACCAAAGCG GGCAACTGTA TCTCCTTTTT 28260

TGTCATTAAG TGATTTAATC GCACCTGAAA TAATACCGAT AGAGCTAAAG TTAGCAAATG 28320 

ATACTAAGAA TACAGATGTA ACACCTTTTG CGTGTTCAGA TAAATCACTA AGTTTACCAA 28380

GTGCTTGCAT TGCTACAAAT TCGTTAGATA ATAGTTTTGT CGCCATAACT GAACCGGCTT 28440 

GAACTGCATC TTGCCATGGC ACACCGACTA AGAATGCAAA TGGTGCAAAG ACAAAACCAA 28500 

TTAATGTTTG GAAATCCCAA GAAATAGCGC CACCTGAAAC TGTACTAAAG ATATTGCTTA 28560 

CAATTCCATT TAATAGAGCG ATAATGGCAA TGTATCCGAT TAACATTGCG CCTACAATGA 28620

CAGCTACTTT AAATCCATCT AAAATATATT CTCCTAGCAT TTCGAAGAAT GATTGTTGTC 28680

TTTCTTCAGT TTCTTCAACT AATAATTTGT CATCTTCTTC ATTAACTTTA TAAGGGTTAA 28740 
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TAGGTTCAAT TAAGGTAAAG TATGCACCGA TAATTGAAGC AGAAACAGTC GACATTGCTG 28860 

AAGCTGTTAA TGTGTATAAA CGTTGCTTAG GTATGTATGG TAATTGTTTT TTAATTGAAA 28920 

TAAATACTTC AGATTGTCCC AAAATTGCTG CAGCAACTGC ATTGTATGAT TCTAAACGTC 28980

CCATACCATT AATTTTAGAA ATTAAGAATC CTAAAACATT AATGATTAAA GGTAAAATCT 29040 

TTGTGTATTG AAGGATACCG ATAATCGCTG AAATAAATAC GATAGGTAAT AATACACTGA 29100 

AGAAGAATGG TGGTTGCTTA GGATCGATAT ATTGAATACC ACCGAATACA AAGTTAACAC 29160

CATCTGCTGC TTTTAATAAT AAGTAGTTAA AACCGTTTGA AATACCACCA ATAACCTTGA 29220

TTCCCATTGT AGTTTTAAGC AAGATAAATG CAAAGATAAG CTGAATTGCA AGTAAAATTC 29280 

CTACATATTT CCAGCGAATA TTTTTCCTGT CTGAGCTAAA TAGAAACGCA AGTGCTAAAA 29340 

AGAAGATAAT TCCGATAATC CCAATTAGAA TATGCATATA TTTCTCATTC CTTTAGTTTT 29400

TTCTACaATc TATCATACAA TAAAATGGAA GGGCTAACAT CATAAATTTT TGAAAATATA 29460

AAAACAAATT AATTGAAAAA GGTCAAAATA GGTCATATAA TATAGTCAAA GAAGGTCAAA 29520 

AAGGGGTGAT ATACATGCAC AATATGTCTG ACATCATAGA ACAATAaTCA AACGTTTATT 29580 

TGAAGAGTCG AATGAAGATG TCGTTGAAAT TCAGAGAGCG AATATCGCAC AGCGTTTTGA 29640

TTGCGTACCA TCACAATTAA ATTATGTAAT CAAAACACGA TTCACTAATG AACATGGTTA 29700 

TGAAATCGAA AGTAAACGTG GTGGTGGTGG TTACATCCGA ATCACTAAAA TTGAAAATAA 29760 

AGATGCAACA GGTTATATTA ATCATTTGCT TCAGCTGATT GGACCTTCTA TTTCTCAACA 29820

ACAAGCTTAT TATATTATTG ATGGGCTTTT AGATAAAATG TTAATAAATG AACGTGAAGC 29880

TAAAATGATT CAAGCAGTTA TTGATAGAGA AACGCTATCA ATGGATATGG TTTCTAGAGA 29940

TATTATTAGA GCAAATATTT TAAAACGTTT GTTACCAGTT ATAAATTATT ACTAAATGAA 30000 

ATGAGGTGTT GAAGTGCTTT GTGAAAATTG TCAACTTAAT GAAGCGGAAT TAAAAGTTAA 30060 

AGTTACAAGT AAAAATAAAA CAGAAGAAAA AATGGTGTGT CAAACTTGTG CTGAGGGGCA 30120 

CCATCCGTGG AATCAAGCTA ATGAACAACC TGAaTATCAA GAACATCAAG ATAATTTCGA 30180 

AGAAGCATTT GTTGTTAAGC AAATTTTACA ACATTTAGCT ACGAAACATG GAATTAATTT 30240 

TCAAGA 30246

(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 14333 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 


EP0 786 519 A2 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:

TATTCCCCCA TCGGTTTATT AAATCGTCCA TTTCAATACT GTTTTTCCCC AAGATGTCGA 60 

TAAATCCATT TCAAACGCTT GGACGATATC TTGCATCGTA CATACATTAA TTTCATGTCC 120 

TTTTAATAAT GCTAACTTTT CAACTATGTC TGGGTACTTA CGATATAAAT CAACAACTTG 180 

CTCAAAATCT TTAGAGCCGC TTCGACTACT ACCAATCAAC GTTAATCCTT TTTCAAGTAC 240 

TAATCGTGTA TTCACTTCCA CGGGTAATTC ACTTACGCCT AACAAAGCAA TACTGCCTTC 300 

TGGTGAAATA TGTTCAACTA TTTGTTGAAG TGCAACTTGA CTTCCTTTAC CTCCAACACA 360 

TTCAAATGCA TGATCAATTT TAAGATCATC TGGTATTTGA TTTACTGTAA AGATGTCATC 420 

TACAAATGAA AAATGACTTA ATTTATAGTC TGTCTTACCA AATACATAAG TTTTAGCTTC 480

TGGGTACAAC TTACGTAGCA AAATAGCAGT AATATAACCT AAGTTACCAT CACCCCAAAT 540 

ACCAAAGCTG GTTTTCAAAG GTATAGATTT ACGTTCAAAT CGTTGTATAG CATGATAACT 600

TACTGACACT AACTCTGTGT ATGAAATCGT ACTCAAATCA ATGTCATTAG GCAGCGGAAC 660 

GATACGATCA TGTGCCATCA CAACGTAGTC TTGCATAAAA CCATCATAAC CACTAGATCT 720 

AAAATAACTA GAGGCTAAGT AATTCTCCGC AATAATATGA TGTTGCTCTG TAGGTGTATT 780

CGGTACCATT ACTACTTTCG TACCTTTTTC AAATACCCCT TTACTATCAA ATACAACTTC 840 

ACCAACAGCT TCATGAACTA ATGACATTGG TAATTTTTTG CGTAGTACAT TTTCATCTCT 900 

TCGACCTGTG TAATACCTTT GATCAGCTGC ACAAATAGAC AAGTATAAAG GTCTTACGAT 960 

GACATGATTA CCATAAATAT CAACATTATT ATATGTGACG TCGAACTGTC TCGGTGCAAC 1020 

GAGTTGATAT ACTTGATTAA TCATCGGCAA TATCACCTTG AATAATGGCA TTTGCTACTT 1080

TTAAATCATA CGGTGTTGTC ACTTTAATGT TGTATAGTTC TCCaCGTACC AATTTAACTG 1140 

CATGTCCAGA TTCGACAATG ATTTTACATG CATCTGATAA GATTTCTTTT TGTTCACTAC 1200 

TTAAGGCGCG ATAACTATCT TGTAATAATT TAATATTAAA TGATTGTGGT GTTTGGCCTT 1260

GATACATTTC ATTCCTTACA GGGATACTGT GTATGTTCTG TTTATCTTTA GACATTACAA 1320 

TCGTATCAAT TGCTTCAATG ACTGTATCTA CTGCACCATA TTTTGCTGCT ACTTCAATGT 1380

TCTCTTTAAT AATACGTTGA GTTAAAAATG GTCTTACGGC ATCATGAGTT ACAATCACAT 1440

CATCATTATT AATTCCATTT ACATTGCGAA TATGGTCGAT AATGTTCATA ATTGTTTCAT 1500 

TTCGATCCGT ACCACCTGCA ACTACTTTGA CACGTTGATC TGTAATGTTA TATTTTTTTA 1560 

AAATATCCTG TGTATGGGAA ATCCACTGTG CTGGCGTTGC GATAATAATC TCATTAAATT 1620

CACTCACTAA AATGAACTTC TCAATTGTAT GGATTAAAAT CGGTTTATTA TCAATATCTA 1680
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CTGCATAAAT CATGTTGTCC TCCATTCTGT CATTACATCA TTTCCATTTA TACATTACTG 1800

ACCTATGCCC GCACATAAGC CTAACCTATT GCTCACTTGC CTCTTTTATT AATCCAAAGA 1860

TAGTTGTCAC AATAGTGTGA TAATTTTTTA TAAAAATGTA TTTTTGTAAC TGACCATTCT 1920

AAGTTGTTTT GCCATGCAGT TAATCATTAA CTCTGACGAT ATTAAATTGT TAAAGGTATT 1980

AATGTTTACT CTTTTTCAAA TTCATTATTA CTGCCATCAT TTTACCATAT ATTATAATAA 2040

ATTTATCTTA TTAAGTGGCT GTACTTGATT TTCACTTTAA AAATTATCAA ATATTGCCAT 2100

CTCATTTTAA GTATACAAAA TGCAAAACAA CCGATTCACA AGCATATTTC ACACAAGTAA 2160

ACCGGCTATT TATCAACGTA TATTCGAAGA TGAATTATTT CGATAGTATC TATAGACCAG 2220

ACGGCATTCG CACTTTCATA GCTATAACTA TACCAGCGTT TTCGTCCTCA AAGGTGCATA 2280

CTAATAAATC GTAAACATGA CTTTATCAAA TCGTTCTTTC TTGTTAACTA ATTTATCAAA 2340

TGTCTCCGGG CCTTTTTCTA ACGGTAAAAA ATGAGAAATA ATAGGCTTTA CATTAATATC 2400

TTTCGTCTTC ATATAATGTA AGGTTGCCGT CCACTCTTTG CCCGGAAAAT TACTGGACAA 2460

ACAGTTCCAA GAGCCACATA CTGTCAACTC GTTACGCAGA ATTTTTTCAA AATGAACGCG 2520

ATCAATCTCA ATATCATCAT ATGGTATTCC GAGTAATACC ACCTCGCCAC CTTTTTTAGG 2580

TAGCGTCAA7 ATTTGACCAA TCGTAACTTT AGCACCTGAT GATTCTATAG CTAAATCGAT 2640

TTGATTGGCG TAATGATTTT CGATGAATTT CTCAAGATTT TCTTCTTTTG AATTGATTGT 2700

TTGATGTGCG CCCAATGATG TTGCAATATC TAGTTTATGC GCATCTATAT CTATAGCGAT 2760

GATATGTGCA GCACCAAATA TTCGTGCCCA TTGAATAGCT AACAAACCTA TACTGCCACA 2820

CCCCATTACT GCAACAGTCA TACCAGGTTG TATATTCGAT TTATAAAACC CATGCGCAAC 2880

AACGGCTGAT GGCTCAACCA TTGCTGCTTC AATGTAATCA ACATTGTCTG GAACCTTTAA 2940

AACAJTTTGC GCTGGCAATT TGACATATTC CGCGAACGAT CCAGGTTCAT ATGAGCCAAT 3000

GACGAATAAC TTTTCACATC GTGCATATTC ACCTTTTAAA CAATACTCGC ATTGATAACA 3060

AGGTATTGCT GGGCAACCTG TCACTTTGTC GCCCACATTA ACATGCGTAA CATCACTTCC 3120

AATGGCATCT ACTACACCTG AAAATTCATG ACCAAATGGC ATACCTTTAA TGTATGGCCC 3180

CATTTTTTTG TATCGTGACG TGTCTGAACC ACATATGCCA GTCGCTCGTA CTTTAATAAT 3240

AACGTCATTC GCACTTTCAA TGACTGGCTT TTCATTATCC TCATACCGTA AATCTTCCAC 3300

GCCATATAAT TTCAATGCTT TCACTTGTAA ATCACCTCAA ATTTGATTTA ATTCACAACT 3360

TTTTTCTTTT TAAAAATACC TGTCGCAAAA TAACCTGCAA TGACAATGGA ATTACTTACG 3420

AGTAAATGTT CCATATAAAA ATCAGTGATT TGTCTTAATG GCCCAAGCAT AAAAGTTAGC 3480
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TGCTTTAATA CCTTCGCCGG AT7TTAAATG TTGATACGCC TCGTCCCATT TCGAAATATC 3600

ATATATTTTT GTCACCAAAG CTTCAGCATT TACTAAACCA TCCGCCATAA GTTGCAATGA 3660

AGGTTCCCAA TCTGCTGGCT TTTGACTTCT ACTACCAACA ACTGTTATTT CTTTTTGAAT 3720

CACTTTTTCC ATATCAAATG GAATTTCAGC ATCCTTAAAA ATACCTATTT GACTGTAGAA 3780

ACCTTTTTTG CGTAAAATAT CCAAACCTTG TCGTGCTGCT GGAACTGCAC CTGAACATTC 3840

AACAACAACA TCTGCACCGT AACCGTCTGT AATTCCATTG ATATACGTTT TTAAGTCTGT 3900

TTGTTGTAAA TTGACTACAT AATCCATGTG CAATGCTTCT gctttatcta ATCTGACTTT 3960

GTCATTGTCC AATCCAGTTA CCACAACAGT TGCGCCTTTA CTTTTTAACA CTTGTGCTAC 4020

AAGTAATCCG ATTGGCCCAG GTCCCATTAC AACTGCTACA TCGCCTGAAT TGACTTGAAT 4080

CTTAGAAACG CCATGATGTG CACATGCTAA TGGTTCTGTC ATAGCTGCAG ACTGATACGA 4140

TAtTCGTCTG GAATATGATG CAAACTTTCT TCACGTGCAA TGACATAATT AGTAAATGCG 4200

CCATCAACTT GTGTTCCAAT ACCTTTTCGA TGGTTGCATA AATTATAGTC TTTTGATTTA 4260

CAGTATTCAC ACTCATTACA AACATAGAAT GTCGTTTCAG aTGtGACACG GTCACCAACT 4320

TTAAAATCTT TAACGTCTGC TCCAACTTCA ACGATTTCAC CAGAAAATTC ATGACCTAAT 4380

GTCACTGGAA AATTAACTTT ATAATGACCT TCATAAGTAT GAATATCTGT GCCACAAATT 4440

CCTGCATAAT GTACTTTAAT CTTTALTTTA TCATCTAGCG GTGTTGCAAC TTCTTTATCA 4500

AGAAGTTCTA AGTTGCCATG TCCTTCTCTT GTTTTTACTA AAGCTTTCAC CACAAACACC 4560

TCGATTTTTA ATTGAATAGA CTAAATAGTT TAAAGATAAG ATAGTTAACG ATATTACCAC 4620

CTTGATCAAT ACTTGAAATT TCAGATGAAC CTTTTGGCAT TTGTACATTC GTACCTTTCG 4680

CCATATCTGT GAAAATGGGT GCTACGTCTG TTGCAATATA TAGTGAAATT GCAATCATAA 4740

TCGTACCCAC AATGACAGAA TGAATAATGT TTCCTCTTGC TGCACCAACA ATAAACGCGA 4800

CAAGAAATGG TATCGTTGCT AAGTCACCAA AAGGTAGTAC TTGGTTTCCT GGTAAAATAA 4860

CGGCTAATAA AACAGTGATA GGTACTAAAA TTAATGCTGT CGAAATAACT GCTGGATGAC 4920

CTAATGCTAC AGCCGCATCC AATCCAATAT AAATTTCACG TTCGCCAAAA CGTTTATTTA 4980

GCCATGTTCT TGCAGACTCT GAAACTGGCA TTAAACCTTC CATTAAGATT TTTACCATTC 5040

7AGGCATTAA TACCATTACT GCAGCCATTG ACATTCCTAA ATTAA7GATG TCTCCAGGTT 5100

TGTAACCTGC TAACACACCA ATACCTAAAC CTAAAATTAA GCCGACAAAT ATAGACTCTC 5160

CAAATGCGCC AAAACGTTTT TGAATTGTTT CAGGATCAGC ATCTAACTTA TTCAGACCGG 5220

GTACTTTTTG TAACAATTTA ACTAAGTAAA TACCTGGTGC ATAAGAAATT GTACTTCCTG 5280
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

CTACTTTCAA ACAGATAATT TGGAAAATAA CTGCTGCTAA TAACGCTTGC CAAATACTGC 5400 

CTGATACGGC ATAAACCATT GCTGCTGTAA ACGTATAATG CCAAAAATTC CAAATATCTA 5460 

CATTCATCGT CTTTGTCACT TTAGTTACTA GCAATACAAC GTTAACTATG ATTCCGAGTG 5520 

GAATAATAAA TGCTGCGACA GATGATGCCC AAGCGATAGA TGATGTTGCT GGCCAACCTA 5580

CATCAATCAC ATTCAGACTG ACGCCTAAAT TTTTAACCAT CGCTTGTGCT GCTGGCCCTA 5640 

AATTTTTAAC TAATAAATCG ATGACTAAGA AAATCCCTAC AAAAGCCACA CCTATTGTTA 5700 

AACCAGACCT AAATGCCGCT CCAATTTTCT GCCTAAAGAA TAGGCCAAGC AAGAATATGA 5760 

CAACCGGTAA AATAACAGTt GCACCTAAAT CTAAAAATCC CCTTACAAAA TCAGTGAAGT 5820

AACTCATATT TAAACCCTCC CTGTTATATA TGCATTGTCA CGATACTTTC CGATTGTGAT 5880

TACATTTGAC GTTACAGTCA TTTCAACGAC AACCCTTGCT AAATTCGACT GCAGTCCTTT 5940 

TGAATTACAG tCACTGCGTT TCTATGTCAT CAACAATCAT TTGTCGTGAT AGTCATTTAT 6000

ATGCAATTTG CATATATTAA TATGTTATCG ACCCACGTTA CATATCAATT CCGTTATTTT 6060 

TGTAACTCTG TTAAGATTTG TTGTTTTGTT TCTTCAATAC CAATACCAGT TAAGAAATTA 6120 

CGTGCGTTGA TAACTGGGAA TTTATATTCT TTTTTTGTCA TTGCAGTTGT AACTAATAAA 6180 

TCTGCAGTGT CTTCATAAGG TCCAACTTCT GTAATTTTGA TTTGTTTAAT ATCTACTTTA 6240

ATATTGTGTT CCTTTGCCAT TTCTTCAATT GCATTATTTA CTACTGTTGA CGTTGCAATA 6300 

CCTGCACCAC ACGCTACTAA TACTTGTTTC ATTTTCAATT CCTCCAATTA ATTTTTAGTT 6360

ATATTCCAAA TAATCATTGA TTAGTGTTGC TAAAATTGTT TCATCTTTCG TTCGTAGAAT 6420

CTGCTCCAAT TTTTCTTCAC TTTGAAAAAT TTGCATCAAC TGTTGTAACA GCTTAAGTTG 6480 

ATCATCTACT TTATCCATTG CTAACATAAA AACGATTTTC ACTTCTGTCT GTTGATCAAG 6540 

TGTTCCCATT TCAATAAACG GCACTTCTTT TTCTAGAACA GCCACACCTA TCGTTCTATG 6600 

GTTAATATGT TCGACATCTG TATGCGGTAT AGCGACCGAA CATAGATGCG TTGGTAAACC 6660

AGTAGCAAAT TCTTTTTCTC TGTCGATGAC TGCATCTTTA AACGTTGACT TCACGAACCC 6720 

ATTTTGAAAT AACACA7CTG ACATTTGTGA CAATACGGAT TCTTTATCAG TTGCCGACAA 6780 

ATTGAGCATT ATATTTTCTT TATGCACTAA TTGCTGTCCC ATCCATTTTC CCTCGCTTCT 6840

TTATTTGAAT AATTTTTTAA AATCTCATTT ACATCAGAAT TTTTGCGACT TTGTATGATG 6900 

CGCTTAATTG CGTCATTGTC TTGCGCCACA TCTCTCAATT GTAGTAACGC TCTTAAGTGT 6960

GTCACTTTAT CAACAGCAGC AATAGGTACA ATAATATGGA TTGCTGTGCC ATCTGACATG 7020 

TA7ATTGGTT CTTGTAATAT CAACATACTC ATCGCTGTTT TATGTACATG CTTTTCAGAG 7080 

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

TGCATCTCAT GAATATATTT AATATCAATA AAATGATTAG CAACTAACAC ATCACTTGCT 7200

TTAGCAATAG CTTCATCAAT ATTTTCAACA TGATGCATTC TTTTCACGTG CCTTGCCGGT 7260

ATCAAGTCAG CTAAATCTAA TGyCTwATTT tGTGtGACaA TCGATCCATT AATGGTTGAA 7320

ATTGAATTAT AATTGGCAAT AAAATCTTCT AAACCATCAC GTAGTcTGTA ATGTCATTAA 7380

CTGTCGTTGT GCGTTCAATT AATGCCATTA ACTTGTTTAT TTCCTTATCA ATGTCAGCCG 7440

ATTCCTTATT AATGTACTTC ATCACTTCTT TACGTAACTT TCGTTGCTCA TTTTCAGATA 7500

AAGCTACTTT TGTGATAAAT AATTTTTTAT GTGTTAGGAC AAACATTGGT GAAAAGACGA 7560

TGTCATAATC TAATGTGTAA TTTTCAAATG TTCTAAGTGA AATCGCATCT AAGAAAATAA 7620

TTTCTGGAAA TAAGTTTCGC AACTCGTATA ACATCATTTG TGATACTGAC GTGCCTTGTG 7680

TACACACGAT AATAGCTTTT ATCTTGCCAT CGAAGTTTTC ATCTTGACGT CTCAAACTAC 7740

CTCCGAACAA CATGGTTAAA TATGCTATTT CATTATCAGG CAACGATTTT CCGAAATATT 7800

CAGTTAACGA TTGACATGAT TGTTTCACCA TATGAAATAA GGATTGATAA TTTCCTTGTA 7860

AAGGATTTAT TAATTCATCA CGATCCGTTA AGTTATATTT AA7CCTATAA AAAGCAGGCG 7920

TTAAATGTAA CAAGAGTTGC TGTGATAATT TCTCCTTATC TTCAATGTTA ATAAAAGTGA 7980

TTTGTTCAAA ATGGTGAATC ATTTGAGCGA TGGCCATCGT TAAATTCGAT ATGCTATCTG 8040

ATTCTTGCAA ATCAGTCCAT TGCACACTTG TTGAAAGTAA GTGTAATGTC AAATATAACT 8100

TTTCCGCTTC TGGCAAATCC GGCTCATGTT GCGTCATAAT CTCCGTTGCT TGATATTCTT 8160

TCGTATCCCT CAAATACTGA TAATTAATAT TTAATGGATT CATCACATGA CCACTTTGAA 8220

TTCGTCTACG AATCACACAA AGGACATAAG GCAATGAACT AAGTGATTTG TCTATAAAGC 8280

GACTCTTCAA AAATTGTTCT ACCTGTTTGA TCTTGTCTTT TTGATATGCG ATATCTTCGA 8340

ATGtXAAGTT GAGCGCCTTT AAAACTTCAC TTTTAGTAAT ATCATGATTC AACCTTTGAT 8400

CAATCAACTT AATGAAGAAA CGGCGAACTT CAAATTCATC ACCAACAATT TCATAACCAT 8460

GTTTTCGAGA ATACTTAAGT GACAAACCAT GATTTTCCAA TTGCTCTTTC ACATGATTTA 8520

TATCGTGAAT GACAGTATTT TTACTGACTT GTAAATCAAT TGAAAAATGG TTTAGAGACA 8580

TTGCGTTTTC CTTACTAAAA AGCATGAGCA TTAAATAATA ACGACGTGTT TCTATGCTAA 8640

AAATGACATT GTTGCCGTTT AACATTTGCT GCTCCGATAC ATCTCGCTTG AATAACGTCA 8700

TGATTTCAGA ACTTACAATA AAATTTCCTT GGCTTGTTCT TTCAAGTTTT GGATAACCCT 8760

CTTGTTCAAG CCACAAATTG ATTTTTTGAA TGCGATATCC TAGTTGTCTA CGAGACAAAC 8820

CAAATATCGA TTCAAGTTCT TTACCATGAA TAGTAGGATT CAATACAATT TCTCTGAGTA 8880
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TCAATCGTCA CACCGATGTA CACACTTTGA ACACATATTT TCAAAATGAG CATGTACATC 9000 

ATTGTGATGT TTTAACAACA TTTCAATTAT ATCTATATTT TTTGTGATTT TAATCTTTTA 9060 

AAATAAAGCA ATTGAAATTT TTGCATATAT TTTTGTGTTT TGTGTTTTTT TGAAGCATXT 9120 

TTAACATACA TATCTCAATC ATTATCAAAT TGTCATGACC ATTGTAACCC AATACAAAAA 9180 

CCCTAAGGAC GCTTATATCA GGCGCCTTAG GGTTAACTGT ATCTATTTAA TTAAGTATTA 9240 

TTATTCGTAT GTACGTAACT TATGGTCTAT CAAGTTCCAC ACTTCTTCAA CATCAACTGC 9300 

TGTAGCAAAA TAAGCATTGG CAGGCTTACC TGTAACATGA TTTAAATCGA CAGCCATAGT 9360 

GCCATAAGTT AGTGGACTTT GATGTTCAAT GTCGATATTA ACGGGTACCA TTGTAAACAA 9420 

TTCTGGTTGT AACAAATACA AAATTGTACA AGCATCATGT ATTGGACCAC CATCCATATT 9480

AAAGTGAGTC TTGTATGTCT TCTTAAAGAA TTGCAATAAT TCTACGACGA ACTGTGCAAC 9540 

&GGATTATTG ATACTTTCAA AGCGTTCAAT CACGTGATCG TCGGCTAAAA CTTGATGTGT 9600 

rACATCTAAA CCAAACACAT TTATAGTAAT CCCACTTTCA AAAACACGCT TCGCTGCTTC 9660 

AGCATCTACC CAAATATTGA ATTCTGCTGT AGGCGTCCAA TTTCCAAATG TACCACCACC 9720 

CATCAAAGTA ATAGATTCAA TATGCTCAGC GATTCTTGGC TCACGAATCA ATGCCGTTGC 9780 

rACATTCGTA AGAGGACCTG TCGCTACAAT TGTTACAGGT GTATCACTCG TCATCACTTT 9840 

GTTTATAATC ACATCTGATG CTGGCATTGC AACTGCTTGA CGTGATGGTG TCGACGGTAG 9900 

TTTCGGACCA TCTAATCCAG ATTCCCCATG TATTTCAGAA GCAAAGGCAG CTGGTTTAAT 9960 

TAACGGCCTA TCCGCACCTT TCGCTACTGC TATATCTTGG CGTCCCATAA TATCCAATAC 10020 

3TTCAAGGCG TTTGTCGTAT TCTTGTCAAC TGATTGATTA CCTGCGACTG TTGTTACAGC 10080 
TAATATCTCT AGTGGACTGT CAATTGCCCC CGCTAAAATT AATGCTATTG CATCATCGTG 10140

rCCTGGATCA CAATCCATAA TAATCTTTCT TTTCATTTAT ATATCCACCT TTCTTAAGTT 10200 

GTTATCGATA GCTTATGTAT ATTTATTTAT GTGGTGAATC ATGTTTATTT TGAAAAATAG 10260 

ITTTAACTTT CTCATATTTT TGGATACAAA CACTATTTAT CTATTTTATG GCTTATAAAT 10320 

TTATCCGATA TGCCTTATCA ACCTACCTCG CTAAAAATAG GATGTCTACA TATCTATACC 10380

3ACTTTTGTC AACTCATTTT CACAACAATA TAAACAGCAA TTTATATGAT TGTTACATGA 10440 

TTCAAACAAT TTTTATGAAA AATATTTTCA TACACAGAAT ATATATTGAT ATTAAATTTC 10500 

TCAAAAGCTA TATTGAGAAT AATTAGGAGG GATGTTGATG AAATCTTTAT TTGAAAAAGC 10560 

&CAGCAGTTC GGCAAGTCCT TTATGTTACC TATCGCAATC TTACCAGCTG CAGGTCTATT 10620 

GTTGGGTATC GGTGGTGCAT TAAGTAATCC AAACACCGTT AAAGCATACC CTATTTTAGA 10680 
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AAATTTACCG GTCATCTTTG CAATTGGTGT CGCAATCGGA TTATCTAGAA GCGATAAAGG 10800

TACTGCAGGT tTAGctGCGC TGCTCGGTTT CTTAATTATG AACGCAACTA TGAATGGCTT 10860

ATTAACTATC ACGGGCACAT TGGCAAAAGA TCAGCTTGCA CAAAATGGAC AAGGCATGGT 10920

GCTCGGTATA CAAACGGTTG AAACCGGTGT TTTTGGCGGG ATTATCACAG GTATTATGAC 10980

CGCAATACTT CACAACAAAT ATCACAAAGT GGTATTACCA CCGTATTTAG GTTTCTTTGG 11040

TGGCTCTAGA TTTGTCCCTA TTGTCACAGC ATTTGCCGCA ATCTTTTTAG GTGTATTGAT 11100

GTTTTTCATT TGGCCAAGCA TACAAGCCGG CATTTATCAT GTTGGTGGAT TTGTAACGAA 11160

AACAGGTGCC ATCGGTACTT TTGTTTATGG CTTCATCTTA AGATTGTTAG GTCCACTCGG 11220

TTTACACCAT ATTTTTTACT TACCGTTTTG GCAGACGGCA CTTGGTGGTA CTTTAGAAGT 11280

CAAAGGGCAC TTAGTTCAAG GTACGCAGAA CATCTTCTTT GCTCAACTTG GTGATCCAGA 11340

TGTGACGAAG TATTATTCAG GTGTGTCACG CTTTATGTCA GGCCGTTTTA TTACGATGAT 11400

GTTCGGCTTA TGTGGTGCCG CACTTGCAAT TTATCACACA GCTAAACCTG AACATAAAAA 11460

AGTTGTCGGC GGTTTAATGT TATCCGCTGC ACTCACTTCA TTTTTAACAG GTATTACCGA 11520

ACCTTTAGAG TTTAGTTTCT TGTTTGTCGC ACCTATTCTT TATGTAATCC ATGCCTTCTT 11580

TGATGGATTA GCATTTATGA TGGCAGACAT TTTCAACATT ACAATTGGTC AAACCTTCAG 11640

TGGAGGCTTT ATCGATTTCT TACTCTTTGG TGTGCTACAA GGTAATAGTA AAACAAACTA 11700

CCTATACGTC ATACCTATTG GAATTGTGTG GTTCTGTTTG TATTACATCG TTTTCAGATT 11760

CTTAATTACG AAATTTAATT TCAAAACACC TGGTCGAGAA GATAAAGCTG CAGCACAACA 11820

AGTTGAGGCT ACTGAAAGAG CACAAACTAT TGT7GCTGGT TTGGGAGGCA AAGATAACAT 11880

rGAAATCGTT GACTGTTGTG CAACGAGACT ACGCGTCACA CTTCATCAAA ATGACAAAGT 11940

ZGATAAAGTA TTACTCGAAA GTACTGGTGC CAAAGGTGTA ATCCAGCAAG GCACTGGTGT 12000

3CAAGTAATT TATGGGCCTC ACGTTACAGT TATCAAAAAT GAAATTGAAG AATTGCTCGG 12060

2GATTAAGAC TAACCGAAAT ATCAACAGAA CTAATGGCAA CGATGTACGA AGTAAGAAGT 12120

5ACATCGTTG CTTTTATTTT TAATGTTACA TTTGAAGCAT TAAGTTCATC ATGCACTGTA 12180

5TGAGCCCGC AAATCGCCTC TGCTAGACAA TCATCTTAAT GCTATGATTA AAGCTTAAGT 12240

iCCAGATTTG AATTTAATTT CAACAACGAC TTTCACTACA TTAAAAATAG GGCCACTCGA 12300

2ACATATAGT TGTATCAAAT AGCCCTTTAT ACAATTTTTT GGGTAAGGTT TTACAATTTT 12360

CGGGATGGTA TAGATTTTAT AAAAAGTTA7 TTAAGTTCTT CTGCTTCAGC CATAATATCT 12420

rTTAATGTTT TAGCTGAATG TGCGAACTTG CTTTGTTCTT CGTCGTTTAA TGGGATTTCT 12480
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TCCTCATATT CGCCTTCTAA TAATGCTGAT ACAGTCAATA CGGCATCTTC ATTTCTGAAA 12600

ATCGCTTCAG TAATTCTAGC TAATCCCATT GCAACACCAT AATAAGTGGC ACCTTTAGCT 12660

TGAATAATGT CATATGCTGC ATCACGTGTT TGAACAAAAA TTTGTTCAAT TTGCGCTTTG 12720 

CCCTCAGGAC GTTGTTCAAG TAATGTCTTC AAAGGTTGAC CCGCAATATT AGCGTGTGAC 12780 

CATACTGGTA ATTCAGTGTC ACCATGTTCA CCAATAATTT GAGCATCGAC GCTACGTGGC 12840 

GCAACATCGn AcgyTcGCTT AACAATAATG TAAAGCGTGC AGAGTCTAAA ATTGTACCAG 12900 

AACCTATAAC ACGTTCTTTA GGTAAACCAG AGAATTTCCA TGTTGCATAC GCTAAAATAT 12960 

CAACAGGATT TGTAGC7ACC AAGAAAATAC CATGAAATTT TGATGCCATT ACTTCACCAA 13020

CAATTGATTT GAATATTTTC AAGTTTTTAG ATACTAAATC TAAACGTGTT TCTCCAGGTT 13080 

TTTGTGCAGC ACCAGCACAG ATGACAACTA GATCCGCATC ATGACAATCA CTGTATTCGC 13140

CAGCTTTCAC ACGAACTGTT GTTGGAGAAT ATGGTGTGGC ATGTTTTAAA TCCATAACAT 13200

CTCCTCGAAC TTTTTCAGTG TCTAAATCAA TGATGACTAA TTCATCAACA ATGCTTTGGT 13260 

TCACTAATGA AAATGCGTAG CTTGAACCTA CTGCACCATT ACCTATTAAT ACAACTTTGT 13320 

TCCCTTTAAA TTTGTTCATT ACAAAAACTC CCTTATGATT AATTCACTAA CATACATGTA 13380

GCTTCAAATA TGTTAGTTTA ATGCTGCTTA TTGACGATAC AAAAGCAAAT AAACATCTCT 13440 

TTTATTTTCA ACGCATAACT TAAAAGGTCA TGTGTCATCC GCTTTTAAGT TTGTGATTTA 13500 

TTTCACATAT AAAATGTAAC ATGCATTAAG TACTGGGTCA ATATTAAATT GTGATTTATT 13560 

TCACATTTTA TTTTAATTTT TACACCTTTT TAATTTGTAT mCGATTACAT CTTAGATGTC 13620

TTTAGTCTTC GTACTTCGCC AGTGATTATT TACACTTTCA CATTTTTATT ATCATGTTTA 13680

CTTTTTTCTA GGAAAACAAC AATGTTTTTT GAATTAGTCA AATAAATGCG C7CAA7CGTC 13740 

GGTGTGCAAA CAGACAATTG TACACAATGC TTATTGATAA GTATTTAAAA AATTAAAAAT 13800 

GTCATACAAT TATCAAATTT GCCATTTTAT TTATATTTTC TCAAACCAAT TAATTGAATA 13860

TCGAAATTTT TAGTAGAATA ATCAAAATAT ACAGATTAAA GGAGGAGTAT CATGCTTACA 13920 

GAACAAGAGA AAGACATTAT CAAACAAACG GTGCCTTTAC TTAAAGAGAA AGGGACAGAA 13980 

ATTACGTCAA TCTTTTATCC AAAAATGTTT AAAGCGCATC CTGAACTTTT AAACATGTTT 14040

AATCAAACGA ACCAAAAACG AGGCATGCAA TCTTCAGCAT TAGCACAAGC TGTAATGGCC 14100

GCAGCGGTTA ATATCGATAA CTTAAGTGTT ATTAAACCAG TCATTATGCC AGTCGCATAT 14160 

AAACACTGCG CACTACAAGT TTATGCTGAA CATTATCCAA TTGTGGGGAA AAATTTATTA 14220 

AAAGCCATTC AAGACGTGAC AGGATTAGAA GAAAATGACC CTGTCATTCA AGCTTGGGCA 14280 
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(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 8779 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58: 

GGTATTTTnG GAnGGGTACC TAAAGCAATT CCGGCAAAGG GTnAATCCAG GTACCGAAAT 60

GGACTTCCCG TTATCGATAA TACCGACATA TATTGTGACA AGTAGATTTT ATGGACATTT 120

AGGCTTACTT TTACTTGTGA TAATTGCATG TATGTTTACT GGTATTTAtC CaTCaATACA 180

TATCATTCAA TTATTGATAT ATGTACCGTT TTGTTTTTTC TTAACTGCCt CGGTGACGTT 240

ATTAACATCA ACACTCGGTG TGTTAGTTAG AGATACACAA ATGTTAATGC AAGCAATATT 300

AAGAATATTA TTTTACTTTT CACCAATTTT GTGGCTACCA AAGAACCATG GTATCAGTGG 360

TTTAATTCAT GAAATGATGA AATATAATCC AGTTTACTTT ATTGCTGAAT CATACCGTGC 420

AGCAATTTTA TATCACGAAT GGTATTTCAT GGATCATTGG AAATTAATGT TATACAATTT 480

CGGTATTGTT GCCATTTTCT TTGCAATTGG TGCGTACTTA CACATGAAAT ATAGAGATCA 540

ATTTGCAGAC TTCTTGTAAT ATATTTATAT GACGAAACCC CGCTAACCAT TAATAAATGG 600

AAGTGGGGTT CATTTTTGTT TATAATTTAA GTAAATAACA TATTAAGTTG GTGTATTATG 660

AACGTTTTAA TAAAGAAATT TTATCATTTG GTAGTTCGAA TACTTTCTAA AATGATTACG 720

CCTCAAGTGA TTGATAAACC GCATATCGTA TTTATGATGA CTTTTCCAGA AGATATTAAG 780

CCTATCATCA AAGCATTAAA TAATTCGTCG TATCAGAAAA CTGTTTTAAC AACACCAAAA 840

CAAGCGCCTT ATTTATCTGA ACTTAGCGAC GATGTTGATG TGATAGAAAT GACTAATCGA 900

ACATTGGTAA AACAAATTAA GGCTTTGAAA AGCGCGCAGA TGATTATTAT CGATAATTAT 960

TACCTATTGC TAGGTGGATA TAATAAGACT TCTAATCAAC ACATTGTTCA AACGTGGCAT 1020

GCAAGTGGTG CATTAAAAAA CTTTGGCTTA ACAGATCATC AAGTCGATGT GTCTGACAAG 1080

GCAATGGTTC AGCAGTACCG TAAAGTTTAT CAAGCGACGG ATTTTTACTT AGTGGGTTGT 1140

GAACAAATGT CACAATGTTT TAAACAGTCT TTAGGTGCAA CAGAAGAGCA AATGCTGTAT 1200

TTTGGGCTTC CGAGAATTAA TAAATATTAC ACAGCTGATA GAGCAACGGT TAAGGCAGAG 1260

TTAAAGGATA AATATGGAAT TACAAATAAG TTGGTATTAT ATGTACCAAC ATATAGAGAA 1320

GATAAAGCAG ATAATAGGGC TATTGATAAA GCTTATTTTG AAAAATGTTT ACCAGGATAT 1380
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15 



ATCGACACGT CTACATTAAT GCTAATGTCA GATATAATTA TTAGCGACTA TAGTTCGCTG 1500 

CCAATAGAAG CTAGCTTGTT AGATATTCCA ACTATATTTT ATGTGTATGA TGAAGGAACA 1560 

TATGATCAGG TGAGAGGCCT GAATCAATTT TACAAAGCAA TACCGGATAG CTACAAAGTG 1620 

TATACTGAAG AAGATTTAAT AATGACGATA CAAGAAAAAG AACATCTATT AAGTCCGTTA 1680

TTTAAAGATT GGCATAAGTA TAATACTGAT AAAAGTTTAC ATCAGCTCAC AGAATATATA 1740 

GATAAGATGG TGACAAAATG AGGTTTACGA TAATCATACC TACATGTAAT AATGAGGCAA 1800 

CAATTCGACA ATTGTTAATA TCTATTGAGA GTAAAGAACA CTATAGAATC CTTTGTATTG 1860 

ATGGTGGTTC TACTGATCAA ACAATTCCTA TGATTGAACG GTTACAAAGA GAACTCAAGC 1920 

ATATTTCATT AATACAATTA CAAAATGCTT CGATAGCTAC GTGTATTAAT AAAGGTTTGA 1980 

TGGATATCAA AATGACAGAT CCACATGATA GTGACGCATT TATGGTCATA AAACCAACAT 2040

CAATCGTATT GCCAGGTAAA TTAGATAGGT TAACTGCTGC TTTCAAAAAT AATGATAATA 2100 

TTGATATGGT AATAGGGCAG CGAGCTTACA ATTACCATGG TGAATGGAAA TTGAAAAGTG 2160 

CTGATGAGTT TATTAAAGAC AATCGAATCG TTACATTAAC GGAACAACCA GATTTGTTAT 2220 

CAATGATGTC TTTTGACGGA AAGTTATTCA GTGCTAAATT TGCTGAATTA CAGTGTGaCG 2280

AAACTTTAGC TAACaCATAC AATCACGCAA TACTTGTCAA GGCGATGCAA AAAGCTACGG 2340

ATATACATTT AGTTTCACAG ATGATTGTCG GAGATAACGA TATAGATACA CATGCTACAA 2400

GTAACGATGA AGATTTTAAT AGATATATCA CAGAAATTAT GAAAATAAGA CAACGAGTCA 2460

TGGAAATGTT ACTATTACCT GAACAAAGGC TATTATATAG TGATATGGTT GATCGTATTT 2520 

TATTCAATAA TTCATTAAAA TATTATATGA ACGAACACCC AGCAGTAACG CACACGACAA 2580 

TTCAACTCGT AAAAGACTAT ATTATGTCTA TGCAGCATTC TGATTATGTA TCGCAAAACA 2640

TGTTTGACAT TATAAATACA GTTGAATTTA TTGGTGAGAA TTGGGATAGA GAAATATACG 2700 

AATTGTGGCG ACAAACATTA ATTCAAGTGG GCATTAATAG GCCGACTTAT AAAAAATTCT 2760 

TGATACAACT TAAAGGGAGA AAGTTTGCAC ATCGAACAAA ATCAATGTTA AAACGATAAC 2820 

GTGTACATTG ATGACCATAA ACTGCAATCC TATGATGTGA CAATATGAGG AGGATAACTT 2880

AATGAAACGT GTAATAACAT ATGGCACATA TGACTTACTT CACTATGGTC ATATCGAATT 2940

GCTTCGTCGT GCAAGAGAGA TGGGCGATTA TTTAATAGTA GCATTATCAA CAGATGAATT 3000 

TAATCAAATT AAACATAAAA AATCTTATTA TGATTATGAA CAACGAAAAA TGATGCTTGa 3060 

ATCAATACGC TATGTCGATT TAGTCATTCC AGAAAAGGGC TGGGGACAAA AAGAAGACGA 3120

TGTCGAAAAA TTTGATGTAG ATGTTTTTGT TATGGGACAT GACTGGGAAG GTGAATTCGA 3180 
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TAAAATCAAA 
CAAGAATTAT ATGGTAAAGA TGCTAAATAA ATTATATAGA ACTATCGATA 3300

CTAAACGATA AATTAACTTA GGTTATTATA AAATAAATAT AAAACGGACA AGTTTCGCAG 3360

CTTTATAATG TGCAACTTGT CCGTTTTTAG TATGTTTTAT TTTCTTTTTC TAAATAAACG 3420

ATTGATTATC ATATGAACAA TAAGTGCTAA TCCAGCGACA AGGCATGTAC CACCAATGAT 3480

AGTGAATAAT GGATGTTCTT CCCACATACT TTTAGCAACA GTATTTGCCT TTTGAATAAT 3540

TGGCTGATGA ACTTCTACAG TTGGAGGTCC ATAATCTTTA TTAATAAATT CTCTTGGATA 3600

GTCCGCGTGT ACTTTACCAT CTTCGACTAC AAGTTTATAA TCTTTTTTAC TAAAATCACT 3660

TGGTAAAACA TCGTAAAGAT CATTTTCAAC ATAATATTTC TTACCATTTA TCCTTTGCTC 3720

ACCTTTAGAC AATATTTTTA CATATTTATA CTGATCAAAT GAGCGTTCCA TTAATGCATT 3780

CCCCATCATA TTACGTTGCT TCTCGCCACC AAGGTTTTTA TAGTCTCCTG CACCCATGAT 3840

AACTTGATTA ATTCTAAATT TACCTCGTTT GGTAGTAATC GTATGGTTGT AATTTGCTGT 3900

ATCACTTGAT CCAGTTTTTA AACCATCTGT ACCCGGCAAA CTCATTTTTG CACCTTCCAA 3960

TGAAAAGTTC AATGTGTAAT ACGTAACTGC ATGCGTTGTT GGTGCTAACT GCTTTGTAAA 4020

GTCTAATATT TTAGGTGTCT CTTTAATCAC GTGTAAATCT AAAATGGCAT AGTCTCTAGC 4080

AGTCGTTACA GTACGTTCTT GGTCTTTATA CTTTGTTGGT GCAAATGTAC GTAATCTTGA 4140

ATTTTCAGCA CCCGTTGGAT TGACGAAATG TGTATTTTTC ATTCCGATAG CTTTAGCTTT 4200

GTTATTCATT AAATCAACGA AATCGCTGGT GTTTTTTGAA ACCTTCTTAG CTAAAATTAA 4260

TGCCGCGGCA TTACTAGAAT TAGATACTGT AATTTGTAAT AGGTCTGCGA TTGTCCATAC 4320

TTGTCCAGGA TATAGTTTCG TATTACTCAA CTCAGGTAGT GTAGACATAA TATATTCTTT 4380

GTTCGTCATT GTGACTGTGT CATCAAGTGA AAGCTGCCCC TTATTTACAG CTTCCAATGT 4440

TAAGTACATT GTCATTAATT TAGTCATAGA CGCTGGAtTC CACTTAGTAT CGATATTGTA 4500

TTGATACAGT AATTGTCCAG TTTGACTTAC ATTAACAGCA CTCGTCGGTT CGTATGCAGC 4560

CGACAAACCT GCATAACCAT ATTGATTTGC TGCTTGTACA GGGGTTACGT CACTGTTAGT 4620

AGCTTGTGCA TATGGTGTCA TAATACTTAA TGTTAAACAT AAAATGATGA TAATAGATAT 4680

TAAATTTTTC ATAAAGCGTT AATCTTCCCT TTTCCAATTC TTAAATATTC CCTAAAAGCA 4740

ATGGTTATTC CTACTTACGG AAATCATTGC TAATTCACTT CACCTTAATT AAATTGTTGA 4800

AAATAAAGTT TTCTGCAGTT AATTTGAAAA ATAATGCAAA TATATTACGT GTGTAGCTAA 4860

AGGTGTTATA ATGTTTGTAC GAAGAGCAAA CTTACTCAAA AGCGATTAAT TTTCATGTTT 4920

TAATATAAAG ACTTTGAGAA GTTATTACAA AAAATGCAAT AGAAATATTC TATCATATAA 4980
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15 



AAGTATATGA TAGAAATGCA TGTATCTATC TAAATGAATT AACTATAAAT TTCAAACAGA 5100 

AGAGGTAAAA CTATGAAACG AGAAAATCCA TTGTTTTTCT TATTTAAAAA ACTATCATGG 5160 

CCAGTGGGTC TTATCGTTGC AGCTATCACT AT7TCATCAC TAGGGAGCTT AAGTGGACTA 5220 

TTAGTGCCAC TGTTTACTGG ACGAATTGTA GATAAATTTT CCgTGAGCCA TATCAATTGG 5280 

AATCtAATCG CATTATTTGG TGGTATCTTT GTCATCAATG CTTTATTAAG CGGATTAGGT 5340 

TTATATTTAT TAAGTAAAAT TGGTGAAAAG ATTATTTATG CGATACGCTC AGTTTTATGG 5400 

GAGCATATCA TACAATTAAA AATGCCATTC TTTGACAAAA ATGAAAGTGG TCAATTAATG 5460 

AGTCGATTAA CTGACGATAC GAAAGTGATA AATGAATTTA TTTCACAAAA GCTACCTmAC 5520 

TTATTACCAT CAATCGTTAC ATtAGTTGGG TCACTAATCA TGTTATTTAT TTTAGATTGG 5580 

AAAATGACAT TATTAACATT TATAACGATA CCGATATTCG TTTTaATTAT GATTCCTCTA 5640 

GGTCGTATTA TGCAAAAGAT ATCGACAAGT ACACAATCTG AAATTGCAAA CTTCAGTGGT 5700

TTGTTAGGGC GTGTCCTAAC TGAAATGCGT CTTGTTAAAA TATCAAATAC AGAGCGTCTT 5760 

GAATTAGATA ATGCACATAA AAATTTGAAT GAAATATATA AATTAGGTTT AAAACAGGCT 5820

AAAATTGCGG CAGTTGTACA ACCAATTTCA GGTATAGTTA TGTTGCTAAC AATTGCAATT 5880

ATTTTAGGTT TTGGTGCATT AGAAATTGCG ACTGGTGCAA TCACTGCAGG TACATTAATT 5940

GCAATGATAT TTTATGTTAT TCAGTTATCT ATGCCTTTAA TCAATCTTTC CACGTTAGTT 6000

ACAGATTATA AAAAGGCAGT CGGTGCAAGT AGTAGAATAT ACGAAATCAT GCAAGAACCT 6060 

ATTGAACCGA CAGAAGCTCT TGAAGATTCT GAAAATGTAT TAATTGATGA CGGTGTATTG 6120

TCATTTGAAC ATGTAGACTT TAAATATGAT GTGAAGAAAA TATTAGATGA TGTGTCGTTC 6180

CAAATCCCAC AAGGTCAAGT GAGTGCTTTT GTAGGCCCTT CTGGGTCTGG TAAAAGTACG 6240

ATATTTAATC TGATAGAACG TATGTATGAA ATTGAGTCAG GTGATATTAA ATATGGCCTT 6300 

GAAAGTGTCT ATGATATCCC GTTATCTAAG TGGCGACGCA AAATTGGATA TGTTATGCAA 6360

TCAAATTCGA TGATGAGTGG TACAATTAGA GACAATATTT TATACGGAAT TAATCGTCAT 6420

GTTTCAGATG AAGAACTTAT TAATTATGCT AAATTAGCGA ACTGTCATGA TTTTATCATG 6480

CAATTTGATG AAGGATATGA CACGCTTG7A GGTGAACGAG GATTGAAACT GTCTGGCGGA 6540

CAACGTCAAC GTATTGATAT TGCTAGAAGT TTTGTTAAAA ATCCTGATAT TTTGTTACTT 6600

GATGAAGCAA CAGCTAATCT CGATAGTGAA AGTGAATTGA AAATTCAAGA AGCTTTAGAA 6660 

ACATTGATGG AAGGTAGAAC AACGATTGTC ATTGCGCATC GTTTGTCTAC AATTAAAAAA 6720

GCCGGTCAAA TTATATTCTT AGACAAAGGA CAGGTAACAG GTAAAGGTAC GCATTCAGAA 6780 
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TTTl'ATATAT ATAAGTAAGC 
TTGGAGCAAA TACACATATA CCATCGAGGA AATTAAAGTG 6900

TGGCACATTG ATGGATATAG ATGTtAATAA ATTGCTTCAA GCTTTTGTCT ATTTTAAATC 6960

ATTTGAGAAG TTACGACATA ATAATTCTTA AATTAATGAA ATCGATATTT TAAGAAAAAA 7020

ul 1A1AAGCA ft ft O It T* 1% /"» H T* ^ AACATACATA TATTAAATAC TGTAGCCACG 7080

AVJiLAlnAil LI ILrtiAi 1 i *Pft f*T\ *T*ft ^/^ft ft TACATAGCAA TTTAACTGAT TTTAGAGTCC ACGGTACAGA 7140

AG 1 i IvjAI AI riUAAIGT.il CTAAAli ITT AAAAAATTAA ATCATAGGTG GGTGCCAAAT 7200

G 11111 A 1 1 A ATCAACATTA TTGGTCTAAT TGTATTTCTT GGTATTGCGG TATTATTTTC 7260

AAGAGATCGC AAAAATATCC AATGGCAATC AATTGGGATC TTAGTTGTTT TAAACCTGTT 7320

TTTAGCATGG TTCTTTATTT ATTTTGATTG GGGTCAAAAA GCAGTAAGAG GAGCAGCCAA 7380

TGGTATCGCT TGGGTAGTTC AGTCAGCGCA TGCTGGTACA GGTTTTGCAT TTGCAAGTTT 7440

GACAAATGTT AAAATGATGG ATATGGCTGT TGCAGCCTTA TTCCCAATAT TATTAATAGT 7500

GCCATTATTT GATATCTTAA TGTACrTTAA TATTTTACCG AAAATTATTG GAGGTATTGG 7560

TTGGTTACTA GCTAAAGTAA CAAGACAACC TAAATTCGAG TCATTCTTTG GGATAGAAAT 7620

GATGTTCTTA GGAAATACTG AAGCATTAGC CGTATCAAGT GAGCAACTAA AACGTATGAA 7680

TGAAATGCGT GTATTAACAA TCGCAATGAT GTCAATGAGC TCTGTATCGG GAGCTATTGT 7740

AGGTGCGTAT GTACAAATGG TACCAGGAGA ACTGGTACTA ACGGCAATTC CACTAAATAT 7800

CGTTAACGCG ATTATTGTGT CATGCTTGTT GAATCCAGTA AGTGTTGAAG AGAAAGAAGA 7860

TATTATTTAC AGTCTTAAAA ACAATGAAGT TGAACGTCAA CCATTCTTCT CATTCCTTGG 7920

AGATTCTGTA TTAGCAGCAG GTAAATTAGT ATTAATCATC ATCGCATTTG TTATTAGTTT 7980

TGTAGCGTTA GCTGATCTAT TTGATCGTTT TATCAATTTG ATTACAGGAT TGATAGCAGG 8040

ATGGaTAGGC ataaaaggta GTTTCGGTTT aaaccaaatt TTAGGTGTGT TTATGTATCC 8100

ATTTGCGCTA ttactcggtt TACCTTATGA TGAAGCGTGG 1 JAjLj i AGCAC AACAAATGGC 8160

TAAGAAAATT GTTACaaATG AATTTGTTGT TATGGGTGAA ATTTCTAAAG ATATTGCATC 8220

TTATACACCA CACCATCGTG CGGTTATTAC AACATTCTTA ATTTCATTTG CAAACTTCTC 8280

AACGATTGGT atgattatcg GTACATTGAA AGGCATTGTT GATAAAAAGA CATCAGACTT 8340

tgtatctaaa tatgtaccta TGATGCTATT ATCAGGTATC CTAGTTTCAT TATTAACAGC 8400

AGCTTTCGTT GGTTTATTTG CATGGTAATA TGTCGAAGAG TGACTATGAT AATACA'rrrr 8460

AACTAATAAA TATGTCCAGG CATGTCGTCT ATTGATATAG GTGAGATGCT TGGACTTTTT 8520

TATTATTGAT ATAAAGGTAT nTAAATATTT TTAAAGTTAC CGAAATTGAA GCATTATAAA 8580
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25 



GACAGTAAGG ACTAGGTACA GTCATAGTAC TTCGAGCAAA ATTTGTTTTG TTATTATAAA 8700 

CAACACAAAG GAGATAACTT CTCTAnTGAA GAAGTTAAAA ACATTATAGC AGACAATGAA 8760

ATGAAAGTAA ATTAAAAAT 8779

(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 31096 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:

GTTGCAGTAG TCAAAGAATT AAACAAGGTG AAGGcGTGTA ICTTGCACAC CCGAAAATGT 60

GCGTAAGTTA aCGGATGCAG GACATAAAGT AATTGTTGAA AAAAATGCTG GCATTGGTTC 120

AGGATTTTCT AACGATATGT ATGAAAAAGA AGGCGCTAAG bTCGTAACTC ACGAACAAGC 180

ATGGGAAGCT GATCTTGTTA TCAAAGTAAA AGAACCTCAT rAAAGCGAAT ATCAATATTT 240

CAAAAAGAAT CAAATTATCT GGGGATTTTT CTTCAAAAG AAATAGTAGA 300

AAAAATGCAA GAAGTTGGTG TAACTGCGAT TAGTGGTGAA CCATTATAA AAAATGGAAA 360

AGCAGAATTA TTAGCGCCAA TGAGTGCTAT AGCAGGTCAA GCTCAGCAA TTATGGGAGC 420

TTACTACTCT GAAGCACAAC ATGGTGGTCA AGGTACTTTA TGACTGGTG TACATGAAAA 480

TGTGGATATA CCTGGTAGTA CATATGTGAT TTTCGGTGGT GAGTAGCAG CAACAAATGC 540

AGCAAATGTT GCCTTGGGAC TAAATGCTAA AGTAATCATT .TCGAGTTAA ACGATGACCG 600

CATTAAATAT CTTGAAGATA TGTATGCAGA AAAAGATGTC CAGTAGTCA AATCAACACC 660

AGAftAATTTA GCAGAACAAA TTAAGAAAGC AGATGTATTT TTTCTACAA TTTTAATTTC 720

AGGTGCGAAA CCGCCAAAAT TGGTTACTCG TGAGATGGTT AATCAATGA AAAAAGGTTC 780

AGTATTAATC GATATAGCTA TTGACCAAGG TGGAACTATT AAACAATTA GACCAACTAC 840

AATTTCTGAT CCAGTGTATG AAGAAGAAGG TGTGATTCAT ATGGTGTAC CAAATCAACC 900

AGGAGCAGTC CCAAGAACTT CAACAATGGC ATTAGCACAA GGAAATATTG ATTATATATT 960

AGAAATTTGT GACAAAGGCT TAGAACAAGC AATTAAAGAT ATGAAGCCT TAAGTACTGG 1020

TGTAAACATT TACCAAGGAC AAGTGACAAA TCAAGGATTA CTTCATCAC ATGACCTAGA 1080

TTATAAAGAA ATATTAAATG TTATCGAATA GATAGTAATT TAAATGAAAT TGAGTGAAAT 1140

GAATATTTTA AATATAGCAT TATAGTTTGG ACTAAAAATT TACAAAACGG AAGGATGTAA 1200

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
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TCGAAGAAGC 
TAAAGCAAGC ATTAAACCAT TTATTCGTCG AACACCTCTA ATTAAATCAA 1320

TGTATTTAAG CCAAAGTATA ACTAAAGGGA ATGTATTTCT AAAATTAGAA AATATGCAAT 1380

TCACAGGATC TTTTAAATTT AGAGGCGCTA gCAATnAAAA TTAATCACTT AACAGATGAA 1440

CAAAAAGAAA AAGGCATTAT CGCAGCATCT GCTGGGgAAC CATGCACAAG GTGTTGCTTT 1500

AACAGCTAAA TTATTAGGCA TTGATGCAAC GATTGTAATG CCTGAAACAG CACCACAAGC 1560

GAAACAACAA GCAACAAAAG GCTATGGGGC AAAGGTTATT TTAAAAGGTA AAAACTTTAA 1620

CGAAACTAGA CriTATATGG AAGAATTAGC GAAAGAAAAT GGCATGACAA TCGTTCATCC 1680

ATATGACGAT AAGTTTGTAA TGGCAGGCCA AGGAACAATT GGTTTAGAAA TTTTAGATGA 1740

TATTTGGAAT GTGAATACAG TCATCGTACC AGTTGGCGGT GGAGGATTAA TTGCAGGTAT 1800

TGCCACCGCA TTAAAATCAT TTAACCCTTC AATTCATATT ATCGGTGTTC AATCTGAGAA 1860

TGTTCATGGT ATGGCTGAGT CTTTCTATAA GAGAGATTTA ACTGAACATC GAGTGGATAG 1920

CACAATAGCA GATGGTTGTG ATGTAAAAGT TCCTGGTGAA CAAACATATG AAGTAGTTAA 1980

ACATTTAGTA GATGAATTTA TTCTTGTTAC TGAAGAAGAA ATTGAACATG CTATGAAAGA 2040

TTTAATGCAG CGTGCCAAAA TTATTACTGA AGGTGCAGGC GCATTACCAA CAGCTGCAAT 2100

TTTAAGTGGA AAAATAAACA ATAAATGGCT TGAAGATAAA AATGTTGTTG CATTAGTTTC 2160

AGGCGGGAAT GTTGACTTAA CTAGAGTTTC AGGTGTCATT GAACATGGAC TGAATATTGC 2220

AGATACAAGC AAGGGTGTGG TAGGTTAAAA CATTTAATCT TAAAAATGAG GTGTAATTAT 2280

GTCAAATGGT AAAGAATTAC AAAAAAATAT AGGTTTCTTC TCAGCGTTTG CTATTGTTAT 2340

GGGGACAGTT ATTGGTTCAG GAGTATTCTT TAAAATATCA AACGTAACAG AAGTAACAGG 2400

AACAGCAGGA ATGGCCTTGT TTGTATGGTT CCTAGGCGGC ATCATTACCA TTTGTGCGGG 2460

GTTAACAGCA GCAGAACTTG CTGCTGCAAT CCCTGAAACA GGTGGCTTAA CGAAGTATAT 2520

AGAATATACA TACGGTGATT TCTGGGGCTT CCTATCAGGT TGGGCGCAAT CATTTATTTA 2580

TTTTCCAGCT AACGTAGCAG CATTGTCTAT CGTATTTGCG ACACAGCTAA TTAA'ITTATT 2640

CCATTTATCT ATAGGTTCGT TAATACCAAT AGCAATCGCA TCTGCGTTAT CTATTGTGTT 2700

GATAAATTTC CTAGGTTCAA AAGCAGGCGG AATTTTACAA TCAGTTACTT TAGTAATTAA 2760

ACTGATTCCA ATCATCGTTA TTGTAATTTT TGGTATTTTT CAATCTGGAG ATATCACTTT 2820

TTCATTAATT CCAACTACAG GTAATTCaGG AAATGGCTTC TTTACAGCAA TTGGTAGTGG 2880

TTTATTAGCA ACTATGTTTG CATATGATGG TTGGATTCAT GTAGGAAATG TTGCGGGGGA 2940

ACTTAAAAAT CCTAAACGCG ATTTACCTTT AGCGA7TTCA GTTGGTATCG GTTGTATTAT 3000
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TGGTAATTTA AATGCAGCTT CAGATACATC AAAAATATTA TTTGGTGAAA ATGGCGGTAA 3120 

GATTATTACA ATCGGTATAT TAATTTCTGT TTATGGTACG ATCAATGGCT ATACTATGAC 3180 

TGGTATGCGC GTACCATATG CAATGGCTGA AAGAAAATTA TTGCCATTTA GCCATTTATT 3240 

CGCAAAATTA ACAAAATCTG GCGCACCATG GTTTGGCGCA ATTATACAAC TTATAATCGC 3300 

TATCATCATG ATGTCAATGG GAGCATTTGA TACAATTACA AATATGTTAA TCTTTGTTAT 3360 

TTGGTTGTTC TATTGTATGT CATTTGTTGC GGTAATAATT TTAAGAAAAC GTGAACCAAA 3420 

TATGGAACGA CCATATAAAG TACCGTTATA TCCGATCATA CCTTTAATTG CTATTTTGGC 3480 

AGGATCATTT GTATTAATTA ATACACTGTT TACACAATTT ATATTAGCAA TCATTGGAAT 3540 

TCTAATAACA GCACTTGGTA TACCAGTTTA TTACTATAAA AAGAAACAAA AAGCAGCATA 3600 

AGGTAAGATA ACTAGCATTG AGAATAAATG GATGGACTAC TAATAAATTT AAAGTTTTAC 3660 

ACATTAAAAT CAAAAACCAT TCAATTATTC TATGGAACAG ACAAATTTCT GTTATGGAAT 3720

TTGTCTGTTT TTCAAAAGTA TAGGGAGGCA AATAGAGATG GAAAAGCCGT CAAGAGAGGC 3780 

ATTTGAAGGC AATAATAAGT TGTTAATAGG AATTG7TCTA AGTGTAATAA CGTTTTGGCT 3840 

ATTTGCACAA TCATTGGTTA ATGTTGTACC AATACTTGAA GATAGTTTCA ATACAGATAT 3900

TGGAACGGTT AATATCGCCG TTAGTATAAC TGCTTTATTT TCAGGAATGT TTGTAGTAGG 3960 

AGCAGGTGGT CTTGCTGATA AATATGGCAG AATTAAACTC ACGAACATTG GTATTATCTT 4020 

AAATATATTA GGTTCATTA? TAATCATTAT TTCAAATATT CCTTTATTAC TTATTATAGG 4080

AAGATTAATT CAAGGACTTT CAGCAGCATG TATTATGCCT GCAACTTTGT CTATTATTAA 4140

GTCATATTAC ATTGGGAAAG ATAGACAACG CGCTTTAAGT TATTGCTCAA TTGGCTCATG 4200

GGGCGGCTCT GGTGTTTGTT CATTTTTTGG AGGTGCAGTT GCAACGCTTT TAGGTTGGCG 4260 

TTGGATTTTC ATCCTATCAA TTATAATTTC ATTAATTGCA CTGTTTCTTA TTAAAGGCAC 4320 

ACCTGAAACT AAATCTAAAT CGATTTCTCT AAATAAATTT GACATTAAAG GTCTGGTTCT 4380

TTTAGTCATT ATGCTCCTCA GTTTAAATAT TTTAATTACT AAAGGATCAG AATTAGGTGT 4440

AACCTCACTT CTTTTTATTA CTTTATTAGC TATTGCAATT GGATCTTTTA GTTTATTTAT 4500 

AGTTCTTGAA AAGCGTGCTA CAAATCCTTT AATCGATTTT AAATTATTTA AAAATAAAGC 4560 

TTACACAGGT GCAACAGCTT CAAACTTTTT GTTAAATGGT GTTGCAGGAA CATTAATAGT 4620

AGCCAACACA TTTGTTCAAA GAGGTTTAGG ATATTCTTCA TTGCAAGCAG GAAGTTTATC 4680

AATCACTTAT TTAGTAATGG TACTAATTAT GATTCGTGTT GGTGAAAAGT TACTTCAAAC 4740

ACTCSGATGC AAGAAACCAA TGTTAATTGG AACAGGAGTT CTTATTGTCG GAGAATG7C7 4800 
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ATTCTTTGGT TTAGGACTAG GGATATATGC TACACCATCA ACAGATACAG CAATTGCAAA 4 920 

TGCACCGTTA GAAAAAGTAG GCGTTGCTGC AGGTATCTAT AAAATGGCTT CTGCATTAGG 4980

7GGAGCATTT GGCGTCGCAT TGAGTGGTGC AGTATATGCA ATCGTATCAA ATATGaCAAA 5040 

CATTTATACA GGTGcAATGa TTGnCATTAT GGTTaAATGC AGGTATGGGa ATATTATCaT 5100 

TCGTTATCAT TTTGtTACTT GTGcCTAAAC mAAACGACAC TCAATTATGA TAATTGAGAA 5160 

TTAAATTGAA ATCATACAAG TCGCTACAAT ATTAAACAAA AATATAAACC GATTCTTATG 5220 

TGTCATTATT TTAAATGAAC ATAGGGATTG GTTTTTTATT ACTCTTTTAC GCTACTTTAT 5280

TTATAATTAT TATAAATTGT CACAAATTCA ATTTACCTTA CAATATATTT TGTGTTATTA 5340

TATTCTGGAG CATAAATAAA TTGTTCAACA CATAGTTGTA ATGTGTTTCA ATACTTTTTG 5400

GATAGATTGC GAAATTGTAT TGAATCGTCA TCGTTTTAAA TTTTTAAATG AGAATGGAAT 5460

GAGCATTACA ATACACAAGC AATCAAAAGT AAATACATTC ACAACACAAC AGAGACATAA 5520

CAACAAGATA AGGAGTGAAC AATAGCTGTG AATTATCGTG ATAAAATTCA AAAGTTTAGT 5580 

ATTCGTAAAT ATACAGTTGG TACATTTTCA ACTGTCATTG CGACATTGGT ATTTTTAGGA 5640 

TTCAATACAT CACAAGCACA TGCTGCTGAA ACAAATCAAC CAGCAAGCGT GGTTAAACAG 5700

AAACAACAAA GTAATAATGA ACAGACTGAG AATCGAGAAT CTCAAGTACA AAATTCTCAA 5760 

AATTCACAAA ATGGTCAATC ATTATCTGCT ACTCATGAAA ATGAGCAACC AAATATTAGT 5820 

CAAGCTAATT TAGTAGATCA AAAAGTAGCG CAATCATCTA CTACTAATGA TGAACAACCA 5880

GCATCTCAAA ATGTAAATAC AAAGAAAGAT TCGGCAACGG CTGCGACAAC ACAACCAGAT 5940

AAAGAACAAA GTAAGCATAA ACAAAACGAA AGTCAATCTG CTAATAAAAA TGGAAACGAC 6000 

AATAGAGCGG CTCATGTAGA AAATCATGAA GCAAATGTAG TAACAGCTTC AGATTCATCT 6060 

GATAATGGTA ACGTACAACA TGACCGAAAT GAATTACAAG CGTTTTTTGA TGCAAATTAT 6120 

CATGATTATC GCTTTATTGA CCGTGAAAAT GCAGATTCTG GCACATTTAA CTATGTAAAA 6180 

GGCATTTTTG ATAAGATTAA TACGTTATTA GGOVGTAATG ATCCAATAAA CAATAAAGAC 6240 

TTGCAACTTG CATACAAAGA ATTGGAACAA GCTGTTGCTT TAATTCGTAC AATGCCTCAA 6300 

CGTCAACAGA CTAGCCGACG TTCAAATAGA ATTCAAACGC GTTCGGTTGA GTCAAGAGCT 6360

GCAGAGCCTA GATCAGTATC AGACTATCAA AATGCAAATT CATCATATTA 7GTTGAAAAT 6420

GCTAATGATG GTTCGGGCTA TCCTGTTGGT ACATATATCa ATGCTTCTAG TAAAGGGGCG 6480

CCATATAATT TACCAACTAC ACCATGGAAT ACATTGAAGG CCTCTGACTC AAAGGAAATT 6540

GCTCTTATGA CAGCGAAACA AACTGGAGAC GGGTACCAAT GGGTTATTAA GTTTAATAAA 6600 
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15 



GTAGGAAGAA CTGACTTTGT AACAGTTAAT TCAGATGGAA CAAATGTACA ATGGAGTCAT 6720 

GGAGCAGGAG CAGGTGCAAA TAAACCACTT CAACAAATGT GGGAATATGG AGTAAATGAT 6780 

CCTCATCGTT CACATGACTT TAAAATAAGA AATAGAAGTG GCCAAGTAAT ATATGACTGG 6840

CCAACTGTCC ATATTTATTC TTTAGAAGAT TTATCTAGAG CGAGTGATTA TTTTAGTGAA 6900 

GCTGGAGCGA CACCTGCTAC TAAAGCTTTT GGTAGACAAA ATTTTGAATA TATTAATGGT 6960 

CAAAAACCTG CTGAATCACC GGGTGTTCCT AAAGTTTATA CTTTCATCGG TCAAGGTGAT 7020

GCAAGTTATA CAATTTCATT TAAAACACAA GGTCCAACTG TTAATAAATT GTACTATGCA 7080 

GCAGGTGGGC GTGCTTTAGA GTACAATCAA TTATT7ATGT ACAGTCAACT ATACGTCGAA 7140 

TCAACGCAAG ACCATCAACA ACGTCTTAAT GGTTTAAGAC AAGTGGTTAA TCGTACATAT 7200 

CGCATAGGTA CAACTAAACG TGTAGAAGTG AGTCAAGGAA ATGTACAAAC GAAAAAGGTA 7260 

TTAGAAAGTA CAAACCTAAA TATAGATGAT TTTGTTGATG ATCCTTTAAG TTATGTTAAG 7320 

ACGCCGAGTA ATAAAGTGTT AGGATTTTAT TCGAATAATG CAAATACTAA TGCTTTTAGA 7380 

CCGGGTGGAG CCCAACAATT AAATGAATAT CAATTAAGTC AATTATTTAC TGATCAAAAA 7440 

TTACAAGAAG CAGCAAGAAC TAGAAACCCA ATAAGATTAA TGATTGGTTT CGACTATCCT 7500 

GATGCTTATG GTAATAGTGA AcTTTAGTTC CTGTTAACTT AACGGTATTA CCTGAAATCC 7560 

AACATAATAt TaAATTCTTT AAAAATGACG ATACTCAAAA TATTGCTGAA AAACCATTTT 7620 

CAAAACAAGC TGGGCATCCA GTTTTCTATG TATATGCAGG TAACCAAGGG AATGCTTCCG 7680

TGAATTTAGG TGGTAGCGTA ACATCTATTC AACCATTACG TATTAATTTA ACAAGTAATG 7740

AGAATTTTAC AGATAAAGAT TGGCAAATTA CAGGTATTCC GCGTACATTA CACATTGAAA 7800

ACTCGACAAA TAGACCTAAT AATGCCAGAG AACGCAATAT TGAACTTGTT GGTAACTTAT 7860

TACC^GGGGA TTACTTTGGA ACGATACGTT TTGGACGTAA AGAACAATTA TTCGAAATTC 7920

GTGTTAAACC ACATACACCA ACAATTACAA CGACAGCTGA GCAATTAAGA GGTACAGCAT 7980

TACAAAAAGT GCCTGTTAAT ATTTCGGGAA TACCGTTGGA TCCATCGGCA TTGGTTTATT 8040

TAGTTGCACC AACAAATCAA ACTACGAATG GTGGTAGTGA GGCAGATCAA ATACCATCTG 8100 

GTTATACGAT ACTTGCGACT GGTACACCTG ATGGGGTGCA TAATACAATT ACTATACGAC 8160 

CGCAAGATTA TGTTGTATTC ATACCACCTG TAGGTAAACA AATTAGAGCA GTAGTTTATT 8220 

ATAATAAAGT AGTTGCATCT AATATGAGTA ATGCTGTTAC TATTTTGCCA GATGACATTC 8280

CACCAACAAT CAATAATCCT GTTGGAATAA ATGCCAAATA CTATCGAGGC GACGAAkCAA 8340

CTTTACAATG GGTGTCTCTG ATAGACATTC TGGTATAAAA AATACAACTA TTACGACATT 8400 
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TACAGGTAGA GTGAGTATGA ATCAGGCATT 
TAACAGTGAT ATTACATTTA AAGTGTCAGC 8520

GACAGaCAAT GTCAATAATA CGACAAATGA TAGTCAATCT AAACATGTTT CAATTCATGT 8580

AGGTAAAATT AGTGAAGATG CTCATCCGAT TGTATTAGCA AATACTGAGA AAGTTGTAGT 8640

AGTCAATCCG ACTGCTGTAT CTAATGATGA AAAGCAAAGC ATAATTACTG CCTTTATGAA 8700

TAAAAACCAA AATATAAGAG GATATTTAGC ATCAACTGAT CCAGTAACTG TCGATAATAA 8760

TGGTAATGTC ACATTACATT ACCGTGATGG CTCATCGACA ACGCTTGATG CTACAAATGT 8820

GATGACATAC GAACCAGTTG TGAAACCTGA ATACCAAACT GTCAATGCTG CTAAAACAGC 8880

AACGGTAACG ATTGCTAAAG GACAATCATT TAGTATTGGT GATATTAAAC AATATTTTAC 8940

TTTAAGTAAT GGACAACCTA TTCCAAGTGG CACATTTACA AATATTACAT CTGATAGAAC 9000

TATTCCAACT GCACAAGAAG TTAGTCAAAT GAACGCAGGC ACGCAGTTAT ACCATATAAC 9060

TGCTACAAAT GCGTATCATA AAGATAGTGA AGACTTCTAT ATTAGTTTGA AAATCATCGA 9120

TGTGAAACAA CCAGAAGGCG ATCAACGTGT ATATCGTACA TCAACATATG ATTTAACTAC 9180

TGATGAAATC TCAAAAGTAA AACAAGCATT TATTAATGCA AATAGAGATG TAATTACGCT 9240

TGCCGAAGGT GATATTTCAG TTACAAATAC ACCTAATGGT GCTAATGTAA GTACTATTAC 9300

AGTAAATATT AATAAAGGTC GATTAACGAA ATCATTCGCG TCAAACCTAG CTAATATGAA 9360

TTTCTTGCGT TGGGTTAATT TCCCACAAGA TTATACAGTG ACATGGACGA ATGCAAAAAT 9420

TGCAAACAGA CCAACAGATG GTGGTTTATC ATGGTCTGAT GACCATAAAT CTTTAATTTA 9480

TCGTTATGAT GCTACATTAG GTACTCAAAT TACGACGAAT GATATTTTAA CAATGTTAAA 9540

AGCAACAACT ACAGTGCCTG GATTGCGAAA TAACATTACT GGTAATGAAA AATCACAAGC 9600

AGAAGCTGGC GGAAGACCTA ACTTTAGAAC GACTGGTTAT TCACAATCAA ATGCGACAAC 9660

TGATGGTCAA CGTCAATTTA CGTTGAATGG TCAAGTGATT CAAGTGTTAG ACATCATCAA 9720

CCCTTCAAAC GGTTATGGTG GGCAACCTGT TACAAATTCA AATACTCGTG CAAACCATAG 9780

TAACTCAACT GTTGTTAACG TAAACGAACC GGCAGCTAAT GGTGcTGGCG CATTTACAAT 9840

TGACCACGTT GTAAAAAGTA ATTCTACACA TAATGCAAGT GATGCAGTTT ATAAAGCACA 9900

GTTATACTTA ACGCCATATG GTCCAAAACA ATATGTTGAA CATTTAAATC AAAATACAGG 9960

AAATACTACT GACGCTATTA ACATTTATTT TGTACCAAGT GACTTAGTGA ATCCAACAAT 10020

TTCAGTAGGT AATTACACTA ATCATCAAGT GTTCTCAGGT GAAACATTTA CAAATACTAT 10080

TACAGCGAAT GATAACTTTG GTGTGCAATC TGTAACTGTA CCAAATACAT CACAAATTAC 10140

AGGTACTGTT GATAATAACC ATCAACATGT TTCTGCAACG GCACCAAATG TGACATCAGC 10200
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GTTCAATGTA ACAGTGAAAC CTTTGCGTGA TAAATATCGA GTTGGTACTT CATCAACGGC 10320 

TGCTAATCCT GTGAGAATTG CCAATATTTC GAATAATGCG ACAGTATCAC AAGCTGATCA 10380 

AACGACAATT ATTAATTCGT TAACGTTTAC TGAAACAGTA CCAAATAGAA GTTATGCAAG 10440 

AGCAAGTGCG AATGAAATCA CTAGTAAAAC AGTTAGTAAT GTCAGTCGTA CTGGAAATAA 10500 

TGCCAATGTg CACAGTAACT GTTACTTATC AAGATGGAAC AACATCAACA GTGACTGTAC 10560 

CTGTAAAGCA TGTCATTCCA GAAATCGTTG CACATTCGCA TTACACTGTA CAAGGCCAAG 10620 

ACTTCCCAGC AGGTAATGGT TCTAGTGCAT CAGATTACTT TAAGTTATCT AATGGTAGTG 10680 

ACATTGCAGA TGCAACTATT ACATGGGTAA GTGGACAAGC GCCAAATAAA GATAATACAC 10740 

GTATTGGTGA AGATATAACT GTAACTGCAC ATATCTTAAT TGATGGCGAA ACAACGCCGA 10800

TTACGAAAAC AGCAACATAT AAAGTAGTAA GAACTGTACC GAAACATGTC TTTGAAACAG 10860

CCAGAGGTGT TTTATACCCA GGTGTTTCAG ATATGTATGA TGCGAAACAA TATGTTAAGC 10920

CAGTAAATAA TTCTTGGTCG ACAAATGCGC AACATATGAA TTTCCAATTT GTTGGAACAT 10980 

ATGGTCCTAA CAAAGATGTT GTAGGCATAT CTACTCGTCT TATTAGAGTG ACATATGATA 11040 

ATAGACAAAC AGAAGATTTA ACTATTTTAT CTAAAGTTAA ACCTGACCCA CCTAGAATTG 11100

ACGCAAACTC TGTGACATAT AAAGCAGGTC TTACAAACCA AGAAATTAAA GTTAATAACG 11160

TATTAAATAA CTCGTCAGTA AAATTATTTA AAGCAGATAA TACACCATTA AATGTCACAA 11220

ATATTACTCA TGGTAGCGGT TTTAGTTCGG T7GTGACAGT AAGTGACGCG TTACCAAATG 11280 

GCGGAATTAA AGCAAAATCT TCAATTTCAA TGAACAATGT GACGTATACG ACGCAAGACG 11340 

AACATGGTCA AGTTGTTACA GTAACAAGAA ATGAATCTGT TGATTCAAAT GACAGTGCAa 11400 

CAGTAACAGT GACACCACAA TTACAAGCAA CTACTGAAGG CGCTGTATTT ATTAAAGGTG 11460

GCGA&GTTT TGATTTCGGA CACGTAGAAA GATTTATTCA AAACCCGCCA CATGGGGCAA 11520 

CGGTTGCATG GCATGATAGT CCAGATACAT GGAAGAATAC AGTCGGTAAC ACTCATAAAA 11580 

CTGCGGTTGT AACATTACCT AATGGTCAAG GTACGCGTAA TGTTGAAGTT CCAGTCAAAG 11640

TTTATCCAGT TGCTAATGCA AAGGCGCCAT CACGTGATGT GAAAGGTCAA AATTTGACTA 11700 

ATGGAACGGA TGCGATGAAC TACATTACAT TTGATCCAAA TACAAACACA AATGGTATCA 11760 

CTGCAGCATG GGCAAATAGA CAACAACCAA ATAACCAACA AGCAGGCGTG CAACATTTAA 11820 

ATGTCGATGT CACATATCCA GGTATTTCAG CTGCTAAACG AGTTCCTGTT ACTGTTAATG 11880

TATATCAATT TGAATTCCCT CAAACTACTT ATACGACAAC GGTTGGAGGC ACTTTAGCAA 11940 

GTGGTACGCA AGCATCAGGA TATGCACATA TGCAAAATGC TACTGGTTTA CCAACAGATG 12000 
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TGAATAAACC GAATGTGGCT AAAGTCGT7A ACGCAAAATA TGACGTCATC TATAACGGAC 12120 

ATACTTTTGC AACATCTTTA CCAGCGAAAT TTGTAGTAAA AGATGTGCAA CCAGCGAAAC 12180 

CAACTGTGAC TGAAACAGCG GCAGGAGCGA TTACAATTGC ACCTGGAGCA AACCAAACAG 12240 

TGAATACACA TGCCGGTAAC GTAACGACAT ACGCTGATAA ATTAGTTATT AAACGTAATG 12300 

GTAACGTTGT GACGACATTT ACACGTCGCA ATAATACGAG TCCATGGGTG AAAGAAGCAT 12360 

CTGCAGCAAC TGTAGCAGGT ATTGCTGGAA CTAATAATGG TATTACTGTT GCAGCAGGTA 12420 

CTTTCAACCC TGCTGATACA ATTCAAGTTG TTGCAACGCA AGGAAGCGGA GAGACAGTGA 12480

GTGATGAGCA ACGTAGTGAT GATTTCACAG TTGTCGCACC ACAACCGAAC CAAGCGACTA 12540 

CTAAGATTTG GCAAAATGGT CATATTGATA TCACGCCTAA TAATCCATCA GGACATTTAA 12600 

TTAATCCAAC TCAAGCAATG GATATTGCTT ACACTGAAAA AGTGGGTAAT GGTGCAGAAC 12660 

ATAGTAAGAC AATTAATGTT GTTCGTGGTC AAAATAATCA ATGGACAATT GCGAATAAGC 12720 

CTGACTATGT AACGTTAGAT GCACAAACTG GTAAAGTGAC GTTCAATGCC AATACTATAA 12780 

AACCAAATTC ATCAATCACA ATTACTCCGA AAGCAGGTAC AGGTCACTCA GTAAGTAGTA 12840 

ATCCAAGTAC ATTAACTGCA CCGGCAGCTC ATACTGTCAA CACAACTGAA ATTGTGAAAG 12900 

ATTATGGTTC AAATGTAACA GCAGCTGAAA TTAACAATGC AGTTCaAGTT GCTAATAAAC 12960 

GTACTGCAAC GATTAAAAAT GGCACAGCAA TGCCTACTAA TTTAGCTGGT GGTAGCACAA 13020 

CGACGATTCC TGTGACAGTA ACTTACAATG ATGGTAGTAC TGAAGAAGTA CAAGAGTCCA 13080 

TTTTCACAAA AGCGGATAAA CGTGAGTTAA TCACAGCTAA AAATCATTTA GATGATCCAG 13140 

TAAGCACTGA AGGTAAAAAG CCAGGTACAA TTACGCAGTA CAATAATGCA ATGCATAATG 13200 

CGCAACAACA AATCAATACT GCGAAAACAG AAGCACAACA AGTGATTAAT AATGAGCGTG 13260

CAACACCACA ACAAGTTTCT GACGCACTAA CTAAAGTTCG TGCAGCACAA ACTAAGATTG 13320

ATCAAGCTAA AGCATTACTT CAAAATAAAG AAGATAATAG CCAATTAGTA ACGTCTAAAA 13380

ATAACTTACA AAGTTCTGTG AACCAAGTAC CATCAACTGC TGGTATGACG CAACAAAGTA 13440 

TTGATAACTA TAATGCGAAG AAGCGTGAAG CAGAAACTGA AATAACTGCA GCTCAACGTG 13500 

TTATTGACAA TGGCGATGCA ACTGCACAAC AAATTTCAGA TGAAAAACAT CGTGTCGATA 13560 

ACGCATTAAC AGCATTAAAC CAAGCGAAAC ATGATTTAAC TGCAGATACA CATGCCTTAG 13620 

AGCAAGCAGT GCAACAATTG AATCGCACAG GTACAACGAC TGGTAAGAAG CCGGCAAGTA 13680

TTACTGCTTA CAATAATTCG ATTCGTGCAC TTCAAAGTGA CTTAACAAGT GCTAAAAATA 13740 

GCGCTAATGC TATTATTCAA AAGCCAATAA GAACAGTACA AGAAGTGCAA TCTGCGTTAA 13800 
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CTGATAATAG TGCTTTAAAA ACTGCTAAGA CGAAACTTGA TGAAGAAATC AATAAATCAG 13 920 

TAACTACTGA TGGTATGACA CAATCATCAA TCCAAGCATA TGAAAATGCT AAACGTGCGG 13980 

GTCAAACAGA ATCAACAAAT GCACAAAATG TTATTAACAA TGGTGATGCG ACTGACCAAC 14040 

AAATTGCCGC AGAAAAAACA AAAGTAGAAG AAAAATATAA TAGCTTAAAA CAAGCAATTG 14100 

CTGGATTAAC TCCAGACTTG GCACCATTAC AAACTGCAAA AACTCAGTTG CAAAATGATA 14160 

TTGATCAGCC AACGAGTACG ACTGGTATGA CAAGCGCATC TATTGCAGCA TTTAATGAAA 14220 

AACTTTCAGC AGCTAGAACT AAAATTCAAG AAATTGATCG TGTATTAGCC TCACATCCAG 14280 

ATGTTGCGAC AATACGTCAA AACGTGACAG CAGCGAATGC CGCTAAATCA GCACTTGATC 14340 

AAGCACGTAA TGGCTTAACA GTCGATAAAG CGCCTTTAGA AAATGCGAAA AATCAACTAC 14400

AACATAGTAT TGACACGCAA ACAAGTACAA CTGGTATGAC ACAAGACTCT ATAAATGCAT 14460 

ACAATGCGAA GTTAACAGCT GCACGTAATA AGATTCAACA AATCAATCAA GTATTAGCAG 14520 

GTTCACCGAC TGTAGAACAA ATTAATACAA ATACGTCTAC AGCAAATCAA GCTAAATCTG 14580

ATTTAGATCA TGCACGTCAA GCTTTAACAC CAGATAAAGC GCCGCTTCAA ACTGCGAAAA 14640

CGCAATTAGA ACAAAGCATT AATCAACCAA CGGATACAAC AGGTATGACG ACCGCTTCGT 14700 

TAAATGCGTA CAACCAAAAA TTACAAGCAG CGCGTCAAAA GTTAACTGAA ATTAATCAAG 14760 

TGTTGAATGG CAACCCAACT GTCCAAAATA TCAATGATAA AGTGACAGAG GCAAACCAAG 14820

CTAAGGATCA ATTAAATACA GCACGTCAAG GTTTAACATT AGATAGACAG CCAGCGTTAA 14880

CAACATTACA TGGTGCATCT AACTTAAACC AAGCACAACA AAATAATTTC ACGCAACAAA 14940

TTAATGCTGC TCAAAATcAT GctGCGCTTG AAACAATTAA GTCTAACATT ACGGCTTTAA 15000 

ATACTGCGAT GACGAAATTA AAAGACAGTG TTGCGGATAA TAATACAATT AAATCAGATC 15060 

AAAATTACAC TGACGCAACA CCAGCTAATA AACAAGCGTA TGATAATGCA GTTAATGCGG 15120 

CTAAAGGTGT CATTGGAGAA ACGACTAATC CAACGATGGA TGTTAACACA GTGAACCAAA 15180 

AAGCAGCATC TGTTAAATCG ACGAAAGATG CTTTAGATGG TCAACAAAAC TTACAACGTG 15240 

CGAAAACAGA AGCAACAAAT GCGATTACGC ATGCAAGTGA TTTAAACCAA GCACAAAAGA 15300

ATGCATTAAC ACAACAAGTG AATAGTGcAC AAAACGTGCA AGCAGTAAAT GATATTAAAC 15360 

AAACGACTCA AAGCTTAAAT ACTGCTATGA CAGGTTTAAA ACGTGGCGTT GCTAATCATA 15420 

ACCAAGTCGT ACAAAGTGAT AATTATGTCA ACGCAGATAC TAATAAGAAA AATGATTACA 15480

ACAATGCATA CAACCATGCG AATGACATTA TTAATGGTAA TGCACAACAT CCAGTTATAA 15540 

CACCAAGTGA TGTTAACAAT GCTTTATCAA ATGTCACAAG TAAAGAACAT GCATTGAATG 15600 
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ATTTAAATAA TGCACAACGT CAAAACTTAC AATCGCAAAT TAATGGTGCG CATCAAATTG 15720 

ATGCAGTTAA TACAATTAAG CAAAATGCAA CAAACTTGAA TAGTGCAATG GGTAACTTAA 15780 

GACAAGCTGT TGCAGATAAA GATCAAGTGA AACGTACAGA AGATTA7GCG GATGCAGATA 15840

CAGCTAAACA AAATGCATAT AACAGTGCAG TTTCAAGTGC CGAAACAATC ATTAATCAAA 15900 

CAACAAATCC AACGATGTCT GTTGATGATG TTAATCGTGC AACTTCAGCT GTTACTTCTA 15960 

ATAAAAATGC ATTAAATGGT TATGAAAAAT TAGCACAATC TAAAACAGAT GCTGCAAGAG 16020 

CAATTGATGC ATTACCACAT TTAAATAATG CACAAAAAGC AGATGTTAAA TCTAAAATTA 16080

ATGCTGCATC AAATATTGCT GGCGTAAATA CTGTTAAACA ACAAGGTACA GATTTAAATA 16140 

CAJtCGATGGg TAACTTGCAA GGTGCAATCA ATGATGAACA AACGACGCTT AATAGTCAAA 16200 

ACTATCAAGA TGCGACACCT AGTAAGAAAA CAGCATACAC AAATGCGGTA CAAGCTGCGA 16260 

AAGATATTTT AAATAAATCA AATGGTCAAA ATAAAACGAA AGATCAAGTT ACTGAAGCGA 16320

TGAATCAAGT GAATTCTGCT AAAAATAACT TAGATGGTAC GCGTTTATTA GATCAAGCGA 16380 

nCAAaCAGCA AAACAGCAGT TAAATAATAT GACGCATTTA ACAACTGCAC AAAAAACGAA 16440 

TTTAACAAAC CAAATTAA7A GTGGTACTAC TG7CGCTGGT GTTCAAACGG TTCAATCAAA 16500

TGCCAATACA TTAGATCAAG CCATGAATAC GTTAAGACAA AGTATTGCCA ACAAAGATGC 16560 

GACTAAAGCA AGTGAAGATT ACGTAGATGC TAATAATGAT AAGCAAACAG CATATAACAA 16620 

CGCAGTAGCT GCTGCTGAAA CGATTATTAA TGCTAATAGT AATCCAGAAA TGAATCCAAG 16680

TACGATTACA CAAAAAGCAG AGCAAGTGAA TAGTTCTAAA ACGGCACTTA ACGGTGATGA 16740 

AAACTTAGCT GCTGCAAAAC AAAATGCGAA AACGTACTTA AACACATTGA CAAGTATTAC 16800 

AGATGCTCAA AAGAACAATT TGATTAGTCA AATTACTAGT GCGACAAGAG TGAGTGGTGT 16860 

TGAT3VCTGTA AAACAAAATG CGCAACATCT AGACCAAGCT ATGGCTAGCT TACAGAATGG 16920 

TATTAACAAC GAATCTCAAG TGAAATCATC TGAGAAATAT CGTGATGCTG ATACAAATAA 16980 

ACAACAAGAG TATGATAATG CTATTACTGC AGCGAAAGCG ATTTTAAATA AATCGACAGG 17040 

TCCAAACACT GCGCAAAATG CAGTTGAAGC AGCATTACAA CGTGTTAATA ATGCGAAAGA 17100 

TGCATTGAAT GGTGATGCAA AATTAATTGC AGCTCAAAAC GCAGCGAAAC AACATTTAGG 17160

TACTTTAACG CATATCACTA CAGCTCAACG TAATGATTTA ACAAATCAAA TTTCACAAGC 17220 

TACAAACTTA GCTGGTGTTG AATCTGTTAA ACAAAATGCG AATAGTTTAG ATGGTGCTAT 17280 

GGGTAACTTA CAAACGGCTA TCAACGATAA GTCAGGAACA TTAGCGAGCC AAAACTTCTT 17340

GGATGCTGAT GAGCAAAAAC GTAATGCA7A CAATCAAGCT GTATCAGCAG CCGAAACCAT 17400

TGTTAATAAT GCGAAACATG CATTAAATGG TACGCAAAAC TTAAACAATG CGAAACAAGC 17520

AGCGATTACA GCAATCAATG GCGCATCTGA TTTAAATCAA AAACAAAAAG ATGCATTAAA 17580

AGCACAAGCT AATGGTGCTC AACGCGTATC TAATGCACAA GATGTACAGC ACAATGCGAC 17640

TGAACTGAAC ACGGCAATGG GCACATTAAA ACATGCCATC GCAGATAAGA CGAATACGTT 17700

AGCAAGCAGT AAATATGTTA ATGCCGATAG CACTAAACAA AATGCTTACA CAACTAAAGT 17760

TACCAATGCT GAACATATTA TTAGCGGTAC GCCAACGGTT GTTACGACAC CTTCAGAAGT 17820

AACAGCTGCA GCTAATCAAG TAAACAGCGC GAAACAAGAA TTAAATGGTG ACGAAAGATT 17880

ACGTGAAGCA AAACAAAACG CCAATACTGC TATTGATGCA TTAACACAAT TAAATACACC 17940

TCAAAAAGCT AAATTAAAAG AACAAGTGGG ACAAGCCAAT AGATTAGAAG ACGTACAAAC 18000

TGTTCAAACA AATGGACAAG CATTGAACAA TGCAATGAAA GGCTTAAGAG ATAGTATTGC 18060

TAACGAAACA ACAGTCAAAA CAAGTCAAAA CTATACAGAC GCAAGTCCGA ATAACCAATC 18120

AACATATAAT AGCGCTGTGT CAAATGCGAA AGGTATCATT AATCAAACTA ACAATCCGAC 18180

TATGGATACT AGTGCGATTA CCCAAGCTAC AACACAAGTG AATAATGCTA AAAATGGTTT 18240

AAACGGTGCT GAAAACTTAA GAAATGCACA AAACACTGCT AAGCAAAACT TAAATACATT 18300

ATCACACTTA ACAAATAACC AAAAATCTGC CATCTCATCA CAAATTGATC GTGCAGGTCA 18360

TGTGAGTGAG GTAACTGCTA CTAAAAATGC AGCAACTGAG TTGAATACGC AAATGGGTAA 18420

CTTGGAACAA GCTATCCATG ATCAAAACAC AG7TAAACAA AGTGTTAAAT TTACTGATGC 18480

AGATAAAGCT AAACGTGATG CGTATACAAA TGCGGTAAGC AGAGCTGAAG CAATTCTGAA 18540

TAAAACGCAA GGTGCAAATA CGTCTAAACA AGATGTTGAA GCGGCTATTC AAAATGTTTC 18600

AAGTGCTAAA AATGCATTGA ATGGTGATCA AAACGTTACA AATGCGAAGA ATGCAGCTAA 18660

AAATGCATTA AATAACTTAA CGTCAATTAA TAATGCACAA AAACGTGACT TAACAACTAA 18720

AATTGATCAA GCAACAACTG TAGCTGGTGT TGAAGCTGTA TCTAATACGA GTACACAATT 18780

GAAtACAGCG ATGGCTAACT TGCAAAATGG TATTAATGAT AAAACAAATA CACTAGCAAG 18840

TGAAAACTAT CATGATGCTG ATTCAGATAA GAAAACTGCT TATACTCAAG CCGTTACGAA 18900

CGCAGAAAAT ATTTTAAATA AAAATAGTGG ATCAAATTTA GACAAAACTG CCGTTGAAAA 18960

CGCGTTGTCA CAAGTTGCTA ATGCGAAAGG TGCCCTAAAT GGTAACCATA ATTTAGAGCA 19020

AGCTAAATCA AATGCAAACA CTACTATAAA CGGACTTCAA CATTTAACAA CTGCTCAAAA 19080

AGATAAATTG AAACAACAAG TGCAACAAGC ACAAAATGTT GCAGGTGTAG ATACTGTTAA 19140

ATCAAGTGCC AACACATTAA ATGGTGCTAT GGGTACGTTA AGAAATAGCA TACAAGATAA 19200
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TAACAATGCT 


GTTGATAGTG 


CTAATGGTGT 


CATTAATGCA ACAAGCAATC 


CAAATATGGA 


19320 




TGCTAATGCA ATTAACCAAA TCGCTACACA AGTGACATCA ACGAAAAATG CATTAGATGG 19380

TACACATAAT TTAACGCAAG CGAAACAAAC AGCAACAAAT GCCATCGATG GTGCTACTAA 19440

CTTAAATAAA GCGCAAAAAG ATGCGTTAAA AGCACAAGTT ACAAGTGCGC AACGTGTTGC 19500

AAATGTAACA AGTATCCAAC AAACTGCAAA TGAACTTAAT ACAGCTATGG GTCAATTACA 19560

ACATGGTATT GATGATGAAA ATGCAACAAA ACAAACTCAA AAATATCGTG ACGcTGAACA 19620

AAGTAAGAAA ACTGCTTATG ATCAAGCTGT AGCTGCTGCG AAAGCAATTT TAAATAAACA 19680

AACAGGTTCA AATTCAGATA AAGCAGCAGT TGACCGTGCA TTACAACAAG TAACAAGTAC 19740

GAAAGATGCA TTGAATGGTG ATGCAAAACT GGCAGAAGCG AAAGCGGCAG CTAAACAAAA 19800

CTTAGGCACT TTAAACCATA TTACGAATGC ACAACGTACT GACTTAGAAG GCCAAATCAA 19860

TCAAGCGACG ACTGTTGATG GCGTTAATAC TGTAAAAACA AATGCCAATA CATTAGACGG 19920

CGCAATGAAT AGCTTACAAG GTTCAATCAA TGATAAAGAT GCGACATTAA GAAATCAAAA 19980

TTATCTTGAT GCGGATGAAT CAAAACGAAA TGCATATACG CAAGCTGTCA CAGCGGCTGA 20040

AGGCATTTTA AATAAACAAA CTGGTGGTAA CACATCTAAA GCAGACGTTG ATAATGCATT 20100

AAATGCAGTT ACAAGAGCGA AAGcGgCTTT AAATGGTGCT GACAACTTAA GAAATGCGAA 20160

AACTTCAGCA ACAAATACGA TTGATGGTTT ACCTAACTTA ACACAATTAC AAAAAGACAA 20220

CTTGAAGCAT CAAGTTGAaC AAGCGCAAAA TGTAGCAGGT GTAAATGGTG TTAAAGATAA 20280

AGGTAATACG TTAAATACTG CCATGGGTGC ATTACGTACA AGTATCCAAA ATGATAATAC 20340

GACGAAAACA AGTCAAAATT ATCTTGATGC ATCTGACAGC AACAAAAATA ATTACAATAC 20400

TGCTGTAAAT AATGCAAATG GTGTTATTAA TGCAACGAAC AATCCAAATA TGGATGCTAA 20460

TGCGATTAAT GGCATGGCAA ATCAAGTCAA TACAACAAAA GCAGCGTTAA ATGGTGCACA 20520

AAACTTAGCT CAAGCTAAAA CAAATGCGAC GAACACAATT AACAACGCAC ATGACTTAAA 20580

CCAAAAAO^A AAAGATGCAT TAAAAACACA AGTTAACAAT GCACAACGTG TATcTGATGC 20640

AAATAACGTT CAACACACTG CAACTGAATT GAACAGTGCG ATGACAGCAC TTAAAGCAGC 20700

TATTGCTGAT AAAGAAAGAA CAAAAGCAAG CGGTAATTAT GTCAATGCTG ATCAAGAAAA 20760

ACGTCAAGCG TATGATTCAA AAGTGACTAA CGCTGAAAAT ATCATTAGTG GTACACCGAA 20820

TGCGACATTA ACAGTCAATG ACGTAAATAG TGCGGCATCA CAAGTCAATG CGGCTAAAAC 20880

AGCATTAAAT GGTGATAACA ACTTACGTGT AGCGAAAGAG CATGCCAACA ATACAATTGA 20940

CGGCTTAGCA CAATTGAATA ATGCACAAAA AGCAAAATTA AAAGAACAAG TTCAAAGTGC 21000
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GAAAGGCTTA AGAGATAGTA TTGCGAATGA AGCAACAATT AAAGCAGGTC AAAACTACAC 21120 

TGACGCAAGT CCAAATAATC GTAACGAGTA CGACAGTGCA GTTACTGCAG CAAAAGCAAT 21180 

CATTAATCAA ACATCGAACC CAACGATGGA ACCAAATACT ATTACGCAAG TAACATCACA 21240 

AGTGACAACT AAAGAACAGG CATTAAATGG TGCGCGAAAC TTAGCTCAAG CTAAGACAAC 21300 

TGCGAAAAAC AACTTGAATA ACTTAACATC AATTAACAAT GCACAAAAAG ATGCGTTAAC 21360 

GCGTAgcATT GATGGTGCAA CAACAGTAGC TGGTGTAAAT CAAGAAACTG CAAAAGCAAC 21420 

AGAATTAAAT AACGCAATGC ATAGTTTACA AAATGGTATC AATGATGAGA CACAAACAAA 21480 

ACAAACTCAG AAATACCTAG ATGCAGAGCC AAGTAAGAAA TCAGCTTATG ATCAAGCAGT 21540 

AAATGCAGCG AAAGCAATTT TAACAAAAGC TAGTGGTCAA AATGTAGACA AAGCAGCAGT 21600 

TGAACAAGCA TTGCAAAATG TGAACAGTAC GAAGACGGCG TTGAACGGTG ATGCGAAATT 21660 

AAATGAAGCT AAAGCAGCTG CGAAACAAAC GTTAGGTACA TTAACACACA TTAATAATGC 21720 

ACAACGTACA GCGTTAGACA ATGAAATTAC ACAAGCAACA AATGTTGAAG GTGTTAATAC 21780 

AGTTAAAGCC AAAGCGCAAC AATTAGATGG TGCTATGGGT CAATTAGAAA CATCAATTCG 21840

TGATAAAGAC ACGACGTTAC AAAGTCAAAA TTATCAAGAT GCTGATGATG CTAAACGAAC 21900

TGCTTATTCT CAAGCAGTAA ATGCAGCAGC AACTATTTTA AATAAAACAg CTGGCGGTAA 21960 

TACACCTAAA GCAGATGTTG AAAGAGCAAT GCAAGCTGTT ACACAAGCAA ATACTGcATT 22020

AAACGGTATT CAmAACTTAG ATCGTGCGAA ACArGCTGCT AACACAGCGA TTACAAATGC 22080

TTCGGACTTA AATACAAAAC mAAAAGAAGC ATTAAAAgCA CAAGTAACAA GTGCAGGACG 22140

TGTATCTGCA GCAAATGGTG TTGAACATAC TGCGACTGAA TTAAATACTG CGATGACAGC 22200 

TTTAAAGCGT GCCATTGCTG ATAAAGCTGA GACAAAAGCT AGTGGTAACT ATGTCAATGC 22260 

TGATCJCGAAT AAACGTCAAG CATATGATGA AAAAGTTACA GCTGCCGAAA ATATCGTTAG 22320 

TGGTACACCA ACACCAACGT TAACACCAGC AGATGTTACA AATGCAGCAA CGCAAGTAAC 22380

GAATGCTAAG ACGCAGTTAA ACGGTAATCA TAATTTAGAA GTAGCGAAAC AAAATGCTAA 22440 

CACTGCAATT GATGGTTTAA CTTCTTTAAA TGGTCCGCAA AAAGCAAAAC TTAAAGAACA 22500 

AGTGGGTCAA GCGACGACGT TGCCAAATGT TCAAACTGTT CGTGATAATG CACAAACATT 22560 

AAACACTGCA ATGAAAGGTC TACGAGATAG CATTGCGAAT GAAGCAACGA TTAAAGCAGG 22620

TCAAAACTAC ACAGATGCAA GTCAAAACAA ACAAACTGAC TACAACAGTG CAGTCACTGC 22680 

AGCAAAAGCA ATCATTGGTC AAACAACTAG TCCATCAATG AATGCGCAAG AAATTAATCA 22740 

AGCGAAAGAC CAAGTGACAG CTAAACAACA AGCGTTAAAC GGTCAAGAAA ACTTAAGAAC 22800 
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AGATGCAGTG 


AAACGTCAAA TCGAAGGTGC 


AACGCATGTT AATGAAGTAA 


CACAAGCACA 


22920 




AAATAATGCG GATGCaTTAA ATACAGCTAT GACGAACTTG AAAAATGGTA TTCAAGATCA 22980

GAATACGATT AAGCAAGGTG TTAACTTCAC TGATGCCGAC GAAGCGAAAC GTAATGCATA 23040

TACAAATGCA GTGACGCAAG CTGAACAAAT TTTAAATAAA GCACAAGGTC CAAATACTTC 23100

AAAAGACGGT GTCGAAACTG CGTTAGAaAA TGTACAACGT GCTAAAAACG AATTGAACGG 23160

TAATCAAAAT GTTGCGAACG CTAAGACAAC TGCGAAAAAT GCATTGAATA ACCTAACATC 23220

AATTAA7AAT GCACAAAAAG AAGCATTGAA ATCACAAATT GAAGGTGCGA CAACAGTTGC 23280

AGGTGTAAAT CAAGTGTCTA CAACGGCATC TGAATTAAAT ACAGCAATGA GCAACTTACA 23340

AAATGGTATT AATGATGAAG CAGCTACAAA AGCAGCGCTT AATGGTACTC AAAACCTTGA 23400

AAAAGCTAAA CAACACGCAA ATACAGCAAT TGACGGTTTA AGCCATTTAA CAAATGCACA 23460

AAAAGAGGCA TTAAAACAAT TGGTACAACA ATCGACTACT GTTGCAGAAG CACAAGGTAA 23520

TGAGCAAAAA GCAAACAATG TTGATGCAGC AATGGACAAA TTACGTCAAA GTATTGCAGA 23580

TAATGCGACA ACAAAACAAA ACCAAAATTA TACTGATGCA AGTCAGAATA AAAAGGATGC 23640

GTACAATAAT GCTGTCACAA CTGCACAAGG TATTATTGAT CAAACTACAA GTCCAACTTT 23700

AGATCCGACT GTTATCAATC AAGCTGCTGG ACAAGTAAGC ACAACTAAAA ATGCATTAAA 23760

TGGTAATGAA AACCTAGAGG CAGCGAAACA ACAAGCGTCA CAATCATTAG GTTCATTAGA 23820

TAACTTAAAT AATGCGCAAA AACAAACAGT TACTGATCAA ATTAATGGCG CGCATACTGT 23880

TGATGAAGCA AATCAAATTA AGCAAAATGC GCAAAACTTA AATACAGCGA TGGGTAACTT 23940

GAAACAAGCG ATAGCTGACA AAGATGCTAC GAAAGCGACA GTTAACTTCA CTGATGCAGA 24000

TCAAGCAAAA CAACAAGCAT ATAACaCTGC TGTTACAAAT GCTGAAAATA TCATTTCAAA 24060

AGCTAATGGC GGCAATGCAA CACAAGCTGA AGTTGAACAA GCAATCAAAC AAGTTAATGC 24120

TGCAAAACAA GCATTAAATG GTAATGCCAA CGTTCAACAT GCAAAAGACG AAGCAACAGC 24180

ATTAATTAAT AGCTCTAATG ACCTTAACCA AGCACAAAAA GACGCATTAA AACAACAAGT 24240

TCAAAATGCA ACTACTGTAG CTGGTGTAAA CAATGTTAAA CAAACAGCAC AAGAGTTAAA 24300

CAATGCTATG ACACAATTAA AACAAGGCAT TGCAGATAAA GAACAAACAA AAGCTGATGG 24360

TAACTTTG7C AATGCAGATC CTGATAAGCA AAATGCATAT AATCAAGCAG TAGCGAAAGC 24420

TGAAGCATTA ATTAGTGctA CGCCTGATGT TGTCGTTACA CCTAGCGAAA TTACTGCAGC 24480

GTTAAATAAA GTTACGCAAG CTAAAAATGA TTTAAATGGT AATACAAACT TAGCAACGGC 24540

GAAACAAAAT GTTCAACATG CTATTGATCA ATTGCCAAAC TTAAACCAAG CGCAACGTGA 24600
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AGCGGCGACA 


ACGCTTAATG 


ACGCGATGAC ACAATTGAAA 


CAAGGTATTG 


CGAATAAAGC 


24720 




ACAAATTAAA GGTAGCGAGA ACTATCACGA TGCTGATACT GACAAGCAAA CAGCATATGA 24780

TAATGCAGTA ACAAAAGCAG AAGAATTGTT AAAACAAACA ACAAATCCAA CAATGGATCC 24840

AAATACAATT CAACAAGCAT TAACTAAAGT GAATGACACA AATCAAGCAC TTAACGGTAA 24900

TCAAAAATTA GCTGATGCCA AACAAGATGC TAAGACAACA CTTGGTACAC TAGATCATTT 24960

AAATGATGCT CAAAAACAAG CGCTAACAAC TCAAGTTGAA CAAGCACCAG ATATTGCAAC 25020

AGTTAATAAT GTTAAGCAAA ATGCTCAAAA TCTGAATAAT GCTATGACTA ACTTAAACAA 25080

TGCATTACAA GATAAAACTG AGACATTAAA TAGCATTAAC TTTACTGATG CAGATCAAGC 25140

TAAGAAAGAT GCTTATACTA ATGCGGTTTC ACATGCAGAA GGTATTTTAT CTAAAGCAAA 25200

TGGCAGCAAT GCAAGTCAAA CTGAAGTGGA ACAAGCGATG CAACGTGTGA ACGAAGCGAA 25260

ACAAGCATTG AATGGTAATG ACAATGTACA ACGTGCAAAA GATGCAGCGA AACAAGTGAT 25320

TACAAATGCA AATGATTTAA ATCAAGCAAT GACACAATTG AAACAAGGTA TTGCAGATAA 25380

AGACCAAACT AAAGCAAATG GTAACTTTGT CAATGCTGAT ACTGATAAGC AAAATGCTTA 25440

CAACAATGCG GTAGCACATG CTGAACAAAT AATTAGTGGT ACACCAAATG CAAACGTGGA 25500

TCCACAACAA GTGGCTCAAG CGTTACAACA AGTGAATCaA GCTAAGGGTG ATTTAAACGG 25560

TAACCATAAC TTACAAGTTG CTAAAGACAA TGCAAATACA GCCATTGATC AGTTACCAAA 25620

CTTAAATCAA CCACAAAAAA CAGCATTAAA AGACCAAGTG TCGCATGCAG AACTTGTTAC 25680

AGGTGTTAAT GCTATTAAGC AAAATGCTGA TGCGTTAAAT AATGCAATGG GTACATTGAA 25740

ACAACAAATT CAAGCGAACA GTCAAGTACC ACAGTCAGTT GACTTTACAC AAGCGGATCA 25800

AGACAAACAA CAAGCATATA ACAATGCGGC TAACCAAGCG CAACAAATCG CAAATGGCAT 25860

ACCAACACCT GTATTGACGC CTGATACAGT AACACAAGCA GTGACAACTA TGAATCAAGC 25920

GAAAGATGCA TTAAACGGTG ATGAAAAATT AGCACAAGCG AAACAAGAAG CTTTAGCAAA 25980

TCTTGATACG TTACGCGATT TAAATCAACC ACAACGTGAT GCATTACGTA ACCAAATCAA 26040

TCAAGCACAA GCGTTAGCTA CAGTTGAACA AACTAAACAA AATGCACAAA ATGTGAATAC 26100

aGCaATGAGT AACTTGAAAC aAGGTATTGC aAACAAAGAT ACTGTCAAAG CAAGTGAGAA 26160

CTATCATGAT GCTGATGCCG ATAAGCAAAC AGCATATACA AATGCAGTGT CTCAAGCGGA 26220

AGGTA7TATC AATCAAACGA CAAATCCAAC GCTTAACCCA GATGAAATAA CACGTGCATT 26280

AACTCAAGTG ACTGATGCTA AAAATGGCTT AAACGGTGAA GCTAAATTGG CAACTGAAAA 26340

GCAAAATGCT AAAGATGCCG TAAGTGGGAT GACGCATTTA AACGATGCTC AAAAACAAGC 26400
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AGCAACGAGC CTAGATCAAG CAATGGATCA ATTATCACAA GCTATTAATG ATAAAGCTCA 26520 

AACATTAGCG GACGGTAATT ACTTAAATGC AGATCCTGAC AAACAAAATG CGTATAAACA 26580 

GGCAGTAGCA AAAGCTGAAG CATTATTGAA TAAACAAAGT GGTACTAATG AAGTACAAGC 26640 

ACAAGTTGAA AGCATCACTA ATGAAGTGAA CGCAGCGAAA CAAGCATTAA ATGGTAATGA 26700 

CAATTTGGCA AATGCAAAAC AACAAGCAAA ACAACAATTG GCGAACTTAA CACACTTAAA 26760 

TGATGCACAA AAACAATCAT TTGAAAGTCA AATTACACAA GCGCCACTTG TTACAGATGT 26820 

CACTACGATT AATCAAAAAG CACAAACGTT AGATCATGCG ATGGAATTAT TAAGAAATAG 26880 

TGTTGCGGAT AATCAAACGA CATTAGCGTC TGAAGATTAT CATGATGCAA CTGCGCAAAG 26940 

ACAAAATGAC TATAACCAAG CTGTAACAGC TGCTAATAAT ATAATTAATC AAACTACATC 27000 

GCCTACGATG AATCCAGATG ATGTTAATGG TGCAACGACA CAAGTGAATA ATACGAAAGT 27060 

TGCATTAGAT GGTGATGAAA ACCTTGCAGC AGCTAAACAA CAAGCAAACA ACAGACTTGA 27120 

TCAATTAGAT CATTTGAATA ATGCGCAAAA GCAACAGTTA CAATCACAAA TTACGCAATC 27180 

ATCTGATATT GCTGCAGTTA ATGGTCACAA ACAAACAGCA GAATCTTTAA ATACTGCGAT 27240 

GGGTAACTTA ATTAATGCGA TTGCAGATCA TCAAGCCGTT GAACAACGTG GTAACTTCAT 27300 

CAATGCTGAT ACTGATAAAC AAACTGCTTA TAATACAGCG GTAAATGAAG CAGCAGCAAT 27360 

GATTAACAAA CAAACTGGTC AAAATGCGAA CCAAACAGAA GTAGAACAAG CTATTACTAA 27420 

AGTTCAAACA ACACTTCAAG CGTTAAATGG AGACCATAAT TTACAAGTTG CTAAAACAAA 27480 

TGCGACGCAA GCAATTGATG CTTTAACAAG CTTAAATGAT CCTCAAAAAA CAGCATTAAA 27540 

AGACCAAGTT ACAGCTGCAA CTTTAGTAAC TGCAGTTCAT CAAATTGAAC AAAATGCGAA 27600 

TACGCTTAAC CAAGCAATGC ATGGTTTAAG ACAGAGCATT CAAGATAACG CAGCAACTAA 27660 

AGOUATAGC AAATATATCA ACGAAGATCA ACCAGAGCAA CAAAACTATG ATCAAGCTGT 27720

TCAAGCCGCA AATAATATTA TCAATGAACA AACTGCAACA TTAGATAATA ATGCGATTAA 27780 

TCAAGCAGCG ACAACTGTGA ATACAACGAA AGCAGCATTA CATGGTGATG TGAAGTTACA 27840

AAATGATAAA GATCATGCTA AGCAAACGGT TAGTCAATTA GCACATCTAA ACAATGCACA 27900 

AAAACATATG GAAGATACGT TAATTGATAG TGAAACAACT AGAACAGCAG TTAAGCAAGA 27960 

TTTGACTGAA GCACAAGCAT TAGATCAACT TATGGATGCA TTACAACAAA GTATTGCTGA 28020 

CAAAGATGCA ACACGTGCGA GCAGTGCATA TGTCAATGCA GAACCGAATA AAAAACAATC 28080

CTATGATGAA GCAGTTCAAA ATGCTGAGTC TATCATTGCA GGATTAAATA ATCCAACTAT 28140

CAATAAAGGT AATGTATCAA GTGCGACTCA AGCAGTAA7A TCATCTAAAA ATGCATTAGA 28200 
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TCAATTAACA CCAGCTCAAC AACAAGCGCT AGAAAATCAA ATTAATAATG CAACAACTCG 28320 

TGATAAAGTG GCTGAAATCA TTGCACAAGC GCAAgCATtA AATGAAGCGA TGAAAGCATT 28380 

AAAAGAAAGT ATTAAGGATC AACCACAAAC TGAAGCAAGT AGTAAATTTA TTAACGAGGA 28440 

TCAAGCGCAA AAAGATGCTT ATACGCAAGC AGTACAACAC GCGAAAGATT TGATTAACAA 28500 

AACAACTGAT CCTACATTAG CTAAATCAAT CATTGATCAA GCGACACAGG CAGTGACAGA 28560 

TGCTAAAAAC AATTTACATG GTGATCAAAA ACTAGCTCAA GATAAGCAAC GTGCAACAGA 28620 

AACGTTAAAT AACTTGTCTA ACTTGAATAC ACCACAACGT CAAGCACTTG AAAATCAAAT 28680 

TAATAATGCA GCAACTCGTG GCGAAGTAGC ACAAAAATTA ACTGAAGCAC AAGCACTTAA 28740 

CCAAGCAATG GAAGCTTTAC GTAATAGCAT TCAAGATCAA CAGCAAACGG AAGCGGGTAG 28800 

CAAGTTTATC AATGAAGATA AaCCaCmAAA AGrTGCTTAC CAAGCAGCAG TTCAAAATGC 28860 

AAAAGATTTA ATTAATCAAA CTAACAATCC AACGCTTGAT AAAGCACAAG TTGAACAATT 28920 
GACACAAGCT GTTAACCAAG CTAAAGATAA CCTACACGGT GATCAAAAAC TTGCAGACGA 28980

TAAACAACAT GCGGTTACTG ATTTAAATCA ATTAAATGGT TTGAATAATC CGCAACGTCA 29040 

AGCACTTGAA AGCCAAATAA ACAACGCAGC AACTCGTGGC GAAGTAGCAC AAAAATTAGC 29100

TGAAGCAAAA GCGCTTGATC AAGCAATGCA AGCATTACGT AATAGTATTC AAGATCAACA 29160 

ACAAACAGAA TCTGGTAGCA AGTTTATCAA TGAAGATAAA CCGCAAAAAG ATGCTTACCA 29220

AGCAGCAGTT CAAAATGCAA AAGATTTAAT TAACCAAACA GGTAATCCAA CACTCGACAA 29280

ATCACAAGTA GAACAATTGA CACAAGCAGT AACAACTGCA AAAGATAATC TACATGGTGA 29340

TCAAAAACTT GCTCGTGATC AACAACAAGC AGTAACAACT GTAAATGCAT TGCCAAACTT 29400

AAATCATGCA CAACAACAAG CATTAACTGA TGCTATAAAT GCAGCGCCTA CAAGAACAGA 29460

GGTTSCACAA CATGTTCAAA CTGCTACTGA ACTTGATCAC GCGATGGAAA CATTGAAAAA 29520

TAAAGTTGAT CAAGTGAATA CAGATAAGGC TCAACCAAAT TACACTGAAG CGTCAACTGA 29580

TAAAAAAGAA GCAGTAGATC AAGCGTTACA AGCTGCAGAA AGCATTACAG ATCCAACTAA 29640

TGGTTCAAAT GCGAATAAAG ACGCTGTAGA CCAAGTATTA ACTAAGCTTC AAGAAAAAGA 29700

AAATGAGTTA AATGGTAATG AGAGAGTCGC TGAAGCTAAA ACACAAGCGA AACAAACTAT 29760 

TGACCAATTA ACACATTTAA ATGCTGATCA AATTGCAACT GCTAAACAAA ACATTGATCA 29820

AGCGACGAAA CTTCAACCAA TTGCTGAATT AGTAGATCAA GCAACGCAAT TGAATCAATC 29880

TATGGATCAA TTACAACAAG CAGTTAATGA ACATGCTAAC GTTGAGCAAA CTGTAGATTA 29940

CACACAAGCA GATTCAGATA AACAAAATGC TTATAAACAA GCTATTGCTG ATGCTGAAAA 30000 
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TG CAAAACAA GCATTAAATG GTGATGAACG TGTAGCACTT GCTAAAACAA ATGGTAAACA 30120 

TGACATCGAC CAATTGAATG CATTAAACAA TGCTCAACAA GATGGATTTA AAGGTCGCAT 30180 

CGATCAATCA AACGATTTAA ATCAAATCCA ACAAATTGTA GATGAGGCTA AGGCACTTAA 30240 

TCGTGCAATG GATCAATTGT CACAAGAAAT CACTGACAAT GAAGGACGCA CGAAAGGTAG 30300 

CACGAACTAT GTCAATGCAG ATACACAAGT CAAACAAGTA TATGATGAAA CGGTTGATAA 30360 

AGCGAAACAA GCACTTGATA AATCGACTGG TCAAAACTTA ACTGCAAAAC AAGTTATCAA 30420 

ATTAAATGAT GCAGTCACTG CAGCTAAGAA AGCATTAAAT GGTGAAGAAA GACTTAATAA 30480 

TCGTAAAGCT GAAGCATTAC AAAGATTGGA TCAATTAACA CATCTAAACA ATGCTCAAAG 30540 

ACAATTAGCA ATCCAACAAA TTAATAATGC TGAAACGCTA AATAAAGCAT CTCGAGCAAT 30600 

TAATAGAGCA ACTAAATTAG ATAATGCAAT GGGTTCAGTA CAACAATATA TTGACGAACA 30660

GCACCTTGGT GTTATCAGCA GCACAAATTA CATCAATGCA GATGACAATT TGAAAGCAAA 30720 

TTATGATAAT GCAATTGCGA ATGCAGCACA TGAGTTAGAT AAAGTGCAAG GTAATGCAAT 30780 

TGCaAAAGCT GAAGCAGAGC AATTGAAACA AAATATTATC GATGCTCAAA ATGCATTAAA 30840 

TGGAGACCAA AACCTTGCAA ATGCCAAAGA TAAAGCAAAT GCGTTTGTTA ATTCGTTAAA 30900 

TGGATTAAAT CAACAGCAAC AAGATCTTGC ACATAAAGCA ATTAACAATG CCGATACTGT 30960 

ATCAGATGTA ACAGATATTG TTAATAATCA AATTGACTTA AATGATGCAA TGGAAACATT 31020

GAAACATTTA GTTGACAATG AAATTCCAAA TGCAGAGCAA ACTGTCAATT ACCAAAACGC 31080

TGACGATAAT GCTAAA 31096 
(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2243 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 
ATGACAGAAT GGGAGCGAGG ACTTAGAATG TTTCCTAAAT CAGGTTTATT AAATTTTGAG 60

TTAGCGATAG mAAATCGTTC ATTAAATGAT GATGAAAAAG CATTAAAATA TGTGCGTAAA 120

GCATTAAATG CAGACCCTAA AAATACAGAT TATATTAACT TAGAAAAAGA GTTGACTAAA 180

TCAAATGAGT CGAAAAATAA ATAACTTTTA TGATGTACAA CAGTTATTGA AAAGTTACGG 240

ATTTCTAATA TATTTTAAAA ATCCAGAAGA TATGTACGAA ATGATTCAAC AGGAGATTTC 300
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TAATCAGAGA AGGAATGAAC AGAAATGACA AAAATTATTT TAGCAGCTGA TGTAGGCGGG 420 

ACGACTTGTA AATTAGGTAT TTTCACACCT GAATTAGAAC AATTACATAA ATGGTCTATT 480

CACACTGATA CATCTGATAG TACAGGATAT ACACTTTTGA AAGGAATTTA TGATTCGTTT 540

GTTGAAAAAG TAAATGAAAA TAATTATAAT TTTTCAAATG TACTTGGCGT AGGTATTGGT 600 

GTACCAGGTC CTGTTGACTT TGAAAAAGGT ACAGTAAATG GAGCAGTAAA CTTATATTGG 660 

CCAGAAAAAG TTAATGTACG TGAGATTTTT GAACAATTCG TTGATTGTCC AGTGTATGTA 720 

GATAATGATG CTAACATAGC TGCTTTAGGG GaGAAACACA AAGGTGCTGG TGAAGGTGCC 780

GATGATGTTG TTGCCATCAC ACTTGGTACA GGTCTAGGTG GAGGAATTAT TTCCAAATGG 840

TGAAATCGTA CATGGTCATA ATGGCTCtGG CGCAGAAATA GGTCATTTTA GAgCAGACTT 900

CgATCAACGA TTTaAATGTA ATTGTGGTCG TTCTGGATGT ATTGAAACAG TTGCTTCaGC 960 

GACAGGCGTT GTTAACTTAG TTAACTTCtA CTATCCGAAG TTGACGTTTA GATCTTCTAT 1020

ATTAGAATTG ATTAAAGAAA ATAAGGTtAC aGCAAAAGCT GTTTTTGATG CGGCAAAAGC 1080

TGGTGACCAA TTCTGTATTT TCATTACTGA AAAGGTTGCA AACTATATTG GATATTTATG 1140

TAGTATTATT AGTGTTACAA GTAATCCGAA ATATATCGTT CTAGGTGGAG GAATGTCTAC 1200

TGCAGGACCT ATTTTAATTG AAAATATTAA AACAGAATAT CATAATTTAA CATTTGCACC 1260

TGCTCAATTT GAAACTGAAA TTGTACAAGC GAAATTAGGT AATGATGCAG GTATTACAGG 1320

AGCAGCAGGA TTAATCAAGA CCTATGTATT AGATAAAGAG GGGGTAAAAT AATGGCTATT 1380 

GTTGATGTGG TTGTTATTCC AGTTGGAACG GAAGGTCCGA GTGTTAGTAA ATATATTGCA 1440 

GATATTCAGA AAAAACTTCA AGAATATAAA GCAATGGGTA AAATTGATTT TCAATTAACA 1500 

CCAATGAATA CTCTAATTGA AGGTGAATTA AGCGATGTAT TAGAAGTTGT GCAAGTGATA 1560 

CATGSATTAC CTTTTGATAA AGGTTTAAGT AGAGTTTGTA CAAATATCCG TATTGATGAC 1620 

CGACGAGACA AATCTAGAAA AATGAATGAT AAACTAACAT CAGTACAAAA ACATTTAGAA 1680 

AATAGTGGTG AAAACCTATG AGGATTTCAA GCTTAACTTT AGGCTTAGTT GATACTAATA 1740

CGTATTTCAT CGAAAATGAC AAAGCTGTTA TTCTGATTGA CCCTTCAGGT GAAAGTGAAA 1800 

AAATTATTAA AAAATTAAAC CAAATAAATA AACCGTTAAA AGCTATTTTA TTAACACATG 1860

CACACTTTGA TCATATCGGA GCAGTCGATG ATATAGTTGA TCGATTCGAT GTCCCGGTTT 1920 

ATATGCATGA AGCAGAGTTT GATTTTCTAA AAGATCCCGT TAAAAATGGG GCAGATAAAT 1980 

TTAAGCAATA TGGATTACCA ATTATTACAA GTAAGGTAAC TCCTGAAAAG TTAAmCGAAG 2040

GTAGCACAGA AATAGAAGGA TTTAAGTTnT nAyrTGTaCA CACACCTGGA CATTCACCAG 2100 
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GAATCGGACG TACAGATTTA TATAAAGGTG ATTATGAAAC GCTAGTTGAT TCTATTCAAG 2220 
ATAAAATATT TGAATTAGAA GGC 2243

(2) INFORMATION FOR SEQ ID NO: 61:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8009 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61: 

TTGGnATCAT tyAcgGTAAA AAGAATAAaG CAAGATTtAT TTCATTAGTA CTAATTTGTG 60 

CAATGTTTGC AATTTGTTGG GTTGCATATA TTCAATGGGA GTCTACAATC GCTTCATTTA 120 

CACAATCTAT TAATATTTCa ATGGCACAAT ATAGTGTTTT ATGGACAATT AACGGAATAA 180

TGATTTTAGT AGCACAACCA TTAATTAAAC CGATTCTCTA TCTGTTAAAA GGAAACTTAA 240 

AGAAGCAAAT GTTTGTCGGC ATCATCATTT TTATGTTGTC GTTCTTTGTC ACGAGTTTTG 300

CCGAAAACTT TACAATATTT GTTGTCGGTA TGATTATTTT AACTTTTGGA GAAATGTTTG 360

TATGGCCAGC AGTTCCAACT ATAGCCAATC AGTTAGCGCC AGATGGTAAG CAAGGACAGT 420 

ACCAAGGTTT TGTGAATTCA GCTGCTACAG TAGGAAAAGC ATTTGGTCCA TTTCTTGGTG 480 

GTGTATTAGT TGATGCGTTT AATATGCGCA TGATGTTTAT CGGTATGATG CTACTACTTG 540

TATTTGCATT AATATTATTA ATGGTTTTCA AGGAGAATAA TACGCAACCT AAAAAAATAG 600 

ATGCATAATG AGTAAATAGA ATTAACGTTA TAGACTTGAA ATAAATGTCG TTATAACATA 660 

ATATTAATTT GTATAATTTA ATTTCGTTTG GAGCTTTTCT ACAGAAAGCT AGTGATGCTG 720 

AGAGCTAGTG TTAAGGACTA AATGTAAATC GTATTAATTT TAAATTGAAT GAATGACATC 780 

TCTTACTATT AAAATGAGTG CACAATTTTT GTGAAATAGG GTGGTAACGC GGCAAATGTC 840

GTCCCTATGT AAATAGAATA GTTAGAGGTG TCTTTTTTAT TGAATAGGAG GAAATGTGTT 900 

GAATTACAAC CACAATCAAA TTGAAAAGAA ATGGcAAGAC TATTGGGACG AAAATAAAAC 960 

ATTTAAAACA AATGATAACT TAGGTCAAAA GAAATTTTAT GCTTTAGACA TGTTTCCATA 1020

TCCATCAGGT GCTGGTTTAC ATGTTGGACA TCCTGAGGGc TATACAGCAA CAGATATCAT 1080 

TTCAAGATAT AAAAGAATGC AAGGATATAA TGTATTACAT CCGATGGGGT GGGATGCATT 1140

CGGATTACCA GCAGAGCAAT ATGCTTTAGA CACTGGCAAC GACCCACGTG AATTTACAAA 1200

GAAAAATATC CAAACTTTTA AACGACAAAT TAAAGAATTA GGGTTCAGTT ATGATTGGGA 1260 
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GTTATATAAC AAAGGTTTAG CATACGTTGA 
TGAAGTTGCA GTTAACTGGT GTCCAGCATT 1380

AGGCACTGTT TTATCTAACG AAGAAGTGAT TGATGGTGTC TCTGAACGTG GTGGACATCC 1440

AGTTTATCGT AAGCCGATGA AACAATGGGT ACTTAAAATC ACAGAATATG CAGATCAATT 1500

ATTAGCAGAT TTAGATGATT TAGATTGGCC TGAGTCTTTA AAAGATATGC AGCGCAATTG 1560

GATTGGACGT TCTGAAGGGG CCAAAGTTTC ATTTGATGTA GATAA7ACGG AAGGAAAAGT 1620

AGAAGTATTT ACGACTAGAC CAGATACAAT CTATGGTGCA TCATTCTTAG TCTTAAGTCC 1680

TGAACATGCA TTAGTTAATT CAATTACAAC AGATGAATAT AAAGAAAAAG TAAAAGCTTA 1740

TCAAACAGAA GCTTCTAAAA AGTCAGATTT AGAACGTACA GATTTAGCAA AAGATAAATC 1800

AGGTGTATTT ACTGGTGCAT ATGCAACTAA TCCTTTATCT GGTGAAAAAG TACAAATTTG 1860

GATTGCTGAT TATGTATTAT CAACATATGG TACTGGAGCA ATTATGGCAG TACCAGCGCA 1920

TGATGACAGA GATTATGAAT TTGCTAAAAA GTTTGATTTG CCAATCATTG AAGTCATCGA 1980

AGGTGGAAAT GTTGAAGAAG CAGCATACAC TGGTGAAGGT AAACATATTA ATTCTGGTGA 2040

ACTTGATGGT TTAGAAAATG AAGCGGCAAT TACTAAAGCT ATTCAATTAT TAGAGCAAAA 2100

AGGTGCTGGC GAAAAGAAAG TTAATTACAA ATTAAGAGAT TGGTTATTCA GTCGTCAGCG 2160

TTATTGGGGC GAACCAATTC CTGTCATTCA TTGGGAAGAT GGAACAATGA CAACTGTTCC 2220

TGAAGAAGAG CTACCATTGT TGTTACCTGA AACAGATGAA ATCAAGCCAT CAGGGACTGG 2280

TGAGTCTCCA CTAGCTAATA TTGATTCATT TGTAAATGTT GTAGATGAAA AAACAGGTAT 2340

GAAAGGACGT CGTGAAACAA ATACAATGCC ACAATGGGCA GGTAGTTGTT GGTATTATTT 2400

ACGTTACATC GATCCTAAAA ATGAAAATAT GTTAGCAGAT CCTGAAAAAT TAAAACATTG 2460

GTTACCTGTT GATTTATATA TCGGTGGAGT AGAACATGCG GTTCTTCACT TATTATATGC 2520

AAGATTTTGG CATAAAGTCC TTTATGATTT GGCTATCGTA CCTACTAAAG AACCTTTCCA 2580

AAAATTATTT AACCAAGGTA TGATTTTAGG AGAAGGTAAT GAGAAGATGA GTAAATCTAA 2640

AGGAAATGTA ATCAATCCTG ATGATATAGT ACAGTCTCAT GGTGCAGATA CTTTGCGTCT 2700

TTACGAAATG TTTATGGGAC CTTTAGATGC TGCAATTGCA TGGAGTGAAA AAGGATTAGA 2760

TGGGTCTCGT CGATTCTTAG ATCGCGTATG GCGTTTAATG GTAAATGAAG ATGGGACATT 2820

GAGTTCAAAA ATTGTAACTA CAAA7AATAA ATCTTTAGAT AAAGTTTATA ACCAAACTGT 2880

TAAAAAGGTA ACAGAAGACT TTGAAACATT AGGATTTAAT ACTGCTATTA GTCAATTAAT 2940

GGTATTTATT AATGAGTGTT ATAAAGTTGA TGAAGTTTAT AAACCTTACA TTGAAGGCTT 3000

CGTTAAAATG TTAGCACCTA TTGCACCACA TATCGGTGAA GAATTATGGT CAAAATTAGG 3060
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TGATGAAGTA GAAATCGTTG TTCAAGTGAA TGGTAAATTG AGAGCTAAAA TTAAAATTGC 3180 

TAAAGATACA TCAAAAGAAG AAATGCAAGA AATTGCCTTA TCTAATGACA ATGTTAAAGC 3240 

GAGTATTGAA GGTAAAGACA TCATGAAAGT CATCGCTGTT CCTCAAAAAT TAGTCAATAT 3300 

TGTAGCTAAA TAATGTTTTA AGGAGGACTT TGAAATGAAG TCAATTACTA CAGATGAATT 3360 

AAAAAATAAA CTTTTAGAAT CTAAACCAGT TCAAATTGTT GATGTTCGTA CTGATGAAGA 3420 

AACAGCAATG GGATATATTC CTAATGCAAA GTTAATTCCA ATGGATACCA TTCCGGATAA 3480 

TTTAAATTCA TTTAATAAAA ATGAAATATA TTATATTGTA TGTGCTGGTG GAGTTCGAAG 3540

CGCTAAAGTT GTAGAATATT TAGAGGCAAA TGGCATTGAT GCCGTAAATG TCGAAGGCGG 3600

CATGCACGCA TGGGGCGA7G AAGGTTTGGA AATAAAAAGT ATTTAAAGTA GTGACATAAT 3660 

TTAAAATAAT ATTACATTTG TAATGACACC AAGTAACGTT TCGGTTGCTT GGTGTTTTTT 3720 

GGTATGAATT ACTTTCTGTT ACAAAACAAT CTAAAGCGTT CTTGTTATGT TTTATTAAGA 3780 

TTTTAATTAC AAAACGGAAA CTAAATTGTA ATAAAATAAA ACTTTATTTT ATAAAATGAT 3840 

GATGATAAAA TTGAGTGAAC TTAAAATATT GTACAAAATA ATATAGCTAT AAATATAATA 3900 

TAGCTATAAA TATAATATGA GGGAGCGTAT ATTTTTAGCA TAATTCTTAA CAACACAGCA 3960

GAGAACAGAC AACCAGGAGG AAAATGAAAT GAATTTGTTA AAGAAAAATA AATATAGTAT 4020 

TAGGAAGTAT AAAGTAGGCA TATTCTCTAC TTTAATCGGA ACAGTTTTA? TACTTTCAAA 4080

CCCAAATGGT GCACAAGCCT TAACTACGGA TAATAATGTA CAAAGCGATA CTAATCAAGC 4140

AACACCTGTA AATTCACAAG ATAAAGATGT TGCTAATAAT AGAGGTTTAG CAAATAGTGC 4200

GCAGAATACA CCTAATCAAT CTGCAACAAC CAATCAAGCA ACGAATCAAG CATTGGTTAA 4260

TCATAATAAT GGTAGTATAG TAAATCAAGC TACGCCAACA TCAGTGCAAT CAAGTACGCC 4320

TTCAGCACAA AACAATAATC ATACAGATGG CAATACAACA GCAACTGAGA CAGTGTCAAA 4380

CGCTAATAAT AATGATGTAG TGTCGAATAA TACCGCATTA AATGTACCAA CTAAAACAAA 4440

TGAAAATGGT TCAGGAGGAC ATCTAACTTT AAAGGAAATT CAAGAAGATG TTCGTCATTC 4500

TTCAAATAAA CCAGAGCTAG TTGCAATTGC TGAACCAGCA TCTAATAGAC CGAAAAAGAG 4560

AAGTAGACGT GCGGCACCGG CAGATCCTAA TGCAACTCCA GCAGATCCAG CGGCTGCAGC 4620

GGTAGGAAAC GGTGGTGCAC CAGTTGCAAT TACAGCGCCA TATACGCCAA CAACTGATCC 4680

TAATGCCAAT AATGCAGGAC AAAATGCACC TAACGAAGTG CTGTCATTTG ATGACAATGG 4740

TATTAGACCA AGTACCAACC GTTCTGTGCC AACAGTAAAC GTTGTTAATA ACTTGCCGGG 4800

CTTCACAC7A ATCAATGGTG GCAAAGTAGG GGTGTTTAGT CATGCAATGG TAAGAACGAG 4860
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TGATGCGACT CAAGAAGAAA GACAAGCAGC AATTGACAAA GTGAATGCTG CTGTAACTGC 6730 

AGCAAACACA AACATTTTAA ACGCTAATAC CAATGCTGAT GTTGAACAAG TAAAGACAAA 6840 

TGCGATTCAA GGAATACAAG CAATTACACC AGCTACAAAA GTAAAAACAG ATGCAAAAAA 6900

TGCCATCGAT AAAAGTGCGG AAACGCAACA TAATACGATA TTTAATAATA ATGATGCGAC 6960 

GCTCGAAGAA CAACAAGCAG CACAACAATT ACTTGATCAA GCTGTAGCCA CAGCGAAGCA 7020 

AAATATTAAT GCAGCAGATA CGAATCAAGA AGTTGCACAA GCAAAAGATC AGGGCACACA 7080

AAATATAGTA GTGATTCAAC CGGCAACACA AGTTAAAACG GATACTCGCA ATGTTGTAAA 7140

TGATAAAGCG CGAGAGGCGA TAACAAATAT CAATGCTACA ACTGGCGCGA CTCGAGAAGA 7200 

GAAACAAGAA GCGATAAATC GTGTCAATAC ACTTAAAAAT AGAGCATTAA CTGATATTGG 7260

TGTGACGTCT ACTACTGCGA TGGTCAATAG TATTAGAGAC GATGCAGTCA ATCAAATCGG 7320

CGCAGTTCAA CCGCATGTAA CGAAGAAACA AACTGCTACA GGTGTATTAA ATGATTTAGC 7380

AACTGCTAAA AAGCAAGAAA TTAATCAAAA CACAAATGCA ACAACTGAAG AAAAGCAAGT 7440 

GGCTTTAAAT CAAGTGGATC AAGAGTTAGC AACGGCAATT AATmATATAA ATCAAGCTGA 7500 

TACAAATGCG GAAGTAGATC AAGCGCAACA ATTAGGTACA AAAGCAATTA ATGCGATTCA 7560

GCCAAATATT GTTAAAAAAC CTGCAGCATT AGCACAAATC AATCAGCATT ATAATGCTAA 7620 

ATTAGCTGAA ATCAATGCTA CACCAGATGC AACGAATGAT GAGAAAAATG CTGCGATCAA 76 80 

30 TACTTTAAAT CAAGACAGAC AACAAGCTAT TGAAAGTATT AAACAAGCTA ACACAAATGC 7740 

AGAAGTAGAC CAAGCTGCGA CAGTAGCAGA GAATAATATC GATGCTGTTC AAGTTGATGT 7800 

AGTAAAAAAA CAAGCAGCGC GAGATAAAAT CACTGCTGAA GTGGcGAacG TATTGaAGCG 7860 

3 ° GTTAAACAAA CACCTAATGC AACTGACGAA GAAAAGCAGG CTGCTGTTAA TCAAATCCAA 792 0 

TCAACTTTAA AGATTCAAGC AATTTAATCC AAATTTAATC CAAAACCCAA ACAAATGGAT 7980 

■ 

TCAGGGTAGG ACACCACTTA CAAATCCAA 8009 

40 

(2) INFORMATION FOR SEQ ID NO: 62:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10953 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:

ACCCACCCCn TGGGGATAnT TTACCTGGTG GGGCCTTCGA TTGCCTTTAG GTGAAACCaG 60 
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AGATGAATGC 


TAACCATATT CATTCTGCTA AAGATGGTCG TGTTACTGCG ACAGCTGAAA 180

TTATTCATCG AGGTAAGTCG ACACATGTAT GGGATATAAA AA7TAAGAAT GACAAAGAAC 240

AATTAATTAC AGTTATGCGT GGTACAGTTG CTATTAAACC TTTAAAATAA AAGAACTGCT 300

AGCTGAAATG TTATGAGATA TTCATAACTA CGGCTAGCAG TTTTTTTATG CGCTATATTG 360

TTGTAGTTTT AGAAATGCTT GTTCAATGCG TTCGGCAGCT TTACGGCCAC CCATAACATT 420

TCTACCAAAT GGTCCTAATT CTAAGTCTGC AAAGCATCCT GCGACAAATA GATTTGGTAT 480

CCATTCTAAT TTTTCGGAAA TAACAGGGTA ATTACATTCG TTGATAGGTG CATCATAATT 540

TTGTATTAAT TGCTTAATAA GTGGTTGTGA CATAAAATCT TGTTCAAAAC CAGTTGCAAC 600

CATAATCTGT TGATATGGAA CAGAATCATT TTCAGTGTTA ATTACACCAC CACTAATTTG 660

AGTGATAGGT GTTTTATGCa CATTTATACG ACCATTTTTA ATATGTTTTT TAAGGCGTAA 720

GTACAGTTCG TGAGGCATTG ATCCTTTATG ACGTTCGCGT TGTACAATGG 780

AGGCATGCTT TTAGTACTTA AAAATGAAGA CATATTTTTC GGACCTAACC AACCAGGATC 840

AGCATCAAAG TCATGTATTT CAATATCTTT ATTTAGCCAT AAATGAATCT TTTTATCGTT 900

ATCATGATTT AACAATTTAA GTGCAAGATG TGCAGCAGTa ATGCCGCTAC CAACGATATG 960

ATCGGTCTTA TCATATACTA CTTGATCAAG TTCTTTCTCG AAGATATGAT TTACATTCTG 1020

TTTGTCTTTT AAAATGTCAG GCATAAACGG AATATTTGTA CTGCCTATTG CAATAACGAC 1080

GCAATCTGTA GTGATAATTT GTCCATCTTC TAACTTGATA TGCCATTTGT CTTCTTGTTT 1140

ATCTAAAGTT TGAACTAAAC CTTGAACCAA GCAATCCTCT AATTGATATT GTTTAGAAGC 1200

ATGTGCAATA TGATCCATAA ACATTGTCAA TTCAGGTCGT TGATAAGGAC CATAAAAAGC 1260

ATTTGTATAT TGGTGCTGTT TAGCGAATTG TTTTAGATGG AACGGTTGTG GATGTACGTG 1320

ATGTACAATC GG7GATCTTA AATAAGGCAT TTCTATTCGA TTTGTATATG AGTTAAACCT 1380

TTGGCAAAAA GTTTCGTGTG GGTCAATGAT TGTTAATCGG TCTGTTGTTA ATCCGCTTGA 1440

TAATAGTTTT TGTGCGATTG CAGTTCCCTG TATGCCACCG CCGATAATTG TCCAATGCAT 1500

AATAAAACCT CTCTCTTTTT AAAACGTAAT AGTTACGATT TATAATTATT ATTATCATAA 1560

TACATAACGA CATGAAAGGC AATTAAATTA AAGAGATATA TGTAGATAGG GCGAATCTGT 1620

AGTCAAAGAA AAAATCATTG AAAAAGAGGT AACAATGTCA AAAGAwAACA GCAGTAAAAT 1680

CATTCCTAAT TTGGAATCAT CTTACTGCTG TTTGTTGTTG ATITATATTC ATGATTTTGT 1740

TATATAATCT ACAATTTTGT GTCTTTTAAG TCTTCCGAAA TTTCATCGAC TTTAGTCTTT 1800

TTAGTATAAG GCGTTTTAAT ATTATATGCT GCTTTCATAA TCATATGACT TGAAAGAGGA 1860
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GCAATAAAAT ATAAAAACGT ACCAAATAGT AATGACATTG CACCTAATGT TGATGCTTTT 
CCGGCAGCAT GTGCACGTGA ATATACATCT TCAAGTCTCA ATAATCCTAT AGCTGCTAGG 
GCGCTAATTA AAGCACCGAT GATAACAAAG ATAAGTGCAA GACTAATCAG TATGATTTTG 
ATCATGTTCA ATCACCTTAC CTTTGTCCAT AAATTTAGAG AATACTGCAG TACCTAAAAA 
AGCTAATATA CCAATCATCA TAATAACGAC AATCATGTAT TTAATATTTA ATAAAATACT 
GAATAATGCT ATAACTGCCA TTAATTGAAG ACCAATCGCA TCTAATGCGA CAACACGATC 
GGCAAGTGAT GGGCCTAGCA CAACGCGAAT GAGCATAGCT AACATAGAAA TGACAACTAT 



1980 
2040 
2100 
2160 
2220 
2280 
2340 



GATTAATGCA ATAACGATAA TAACATTATG ATTCATTATA TTTCGCCCAC CTCTCTTACA 2400
ATTTTCTCTA ATGATGTTTT AATACTTTCT ACTTCTTGCT CTTTAGTTGA AAAATCTATG 2460
GCATGAATAT AAATTTTTGT ACGATCGTCA CTTACACCAA GCACTACAGT ACCAGGTGTT 2520
AATGTAATTA AATTAGACAG CAAGACAATT TGCCAATCTT TTTTTAAATC TGTGTGATAA 2580
ACAAAGAATC CTGGTTCATT TTTAATCGAA GGTTTAATAA TAATTTTCAA AACATCAAAA 2640
rTAGCTTTAA TCAGTTCGAT TAAGAAAATA ATAACTAATT TAATAATACG ATATAGCGTG 2700
&TGACATAAA ATCTACCTGG TAACACTCTG TGTAAGAGGT AAACAAGAAC TAGGCCAAAG 2760
.VTGAAACCTA ACACAAAGTT ATTTGTTGTG TAACTATTTG TCACAAACAA CCAAAACACT 2820
3CGATAATAA AGTTTAATAC TAATTGTACA GCCATGTTAT TTACCTCC7A ATACAGCTTT 2880
&ACGTAGGTT GATGGATTGT AGAATGTTTC TGCACCAGCT TTTACCATTG GATATAAGTA 2940
HTCTGCTGAC AATCCATATA AAACAGTTAT CACAACTGCA ACGATTGCAA TCG7AGTTAA 3000
VTATTTGACG TCGACTTTGT TATTAAGATC ATATCCTTTT GGTTGACCGA AAAAGCCTTG 3060
rAGGAATATG CGAATGACAG AATATAATAC GACTAAACTT GATAATAAGA CGATGACACC 3120
\CTTAAATAA AATCCTCTTT CAAATGTTGA TTGGACAATA AAAAATTTTC CATAAAAGCC 3180
\CTGAGTGGG GGAATGCCAG CTAAACTTAA TGCTGCGATA AAGAATGACC AACCAAGTAC 3240
V3GATATCGT TTAATTAAGC CACCAAATTG TCTTAAATCA GCAGTGCCTG TAATTTTAAT 3300
IATAATTCCG ATAAGCAAGA ATAATGCAAG TTTTACTAAC ATGTCGTGCA ATGTATAGTA 3360
\ATAGCCCCA ATCATACCTG ACTCTGTCAT CATTGCAACG CCGACTAAGA TCACACCTAC 3420
VGCAATCATG ACATTGTATA GGATGATTTT TTTAATGTTG GCATATGCAA CAGCACCGAC 3480
VCAACCAAAG ATGATCGTTA ATAGTGCTAA GAATAAAATG ACATAATGTG AAAAGCTTAC 3540
OTATCACTA AAGAATAGGC TCAATGTTCT AGCGATTGCA TAAACACCAA CTTTTGTTAA 3600
^AAAGCACCA AAGAATGCAA TGATTGGAAT TGGTGGgCAT AGTA7GCACT AGGTAACCAA 3660
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ATATTGACTA AGCCACTGTC ATGCGCTGAA AGGTTAGCTA ATTTATTGC? TATATCTGCT 3780 

AGATTCAATG TTCCTACTAC TGAATATAAA ATCGCTACAC CCATTACGAA GAAGGATGAC 3840

GATACAACGT TAACAAGAAC ATATTTTATT GTTTCTTGTA GTTGAATTTT TGTAGAACCA 3900

ATTACTAATA AGAAATAAGA TGACATTAAA AATACTTCGA AAAATACGAA TAGGTTGAAA 3960 

ATGTCACCAG TTGTGAATGC ACCAATGATA CCTATTAACA TAAATAGTAC TGAAAAATAA 4020 

TAATAATATC TTTCACGTTC AATACCAATT GTTTGGTATG AATATAAAAT CACAATAGCT 4080 

GTAATAATAA TACTAGTAAT TATTAGTAGG GCACTGAATA TGTCTAATAC AAAGACAATA 4140 

CTGTATGGTG CTTTCCATGA ACCTAGCTCT ACGCGTATTG GTCCATGTTT AACAACATTT 4200 

GCTAAATTGA TAATTGCCGC GACCAAGGTT AATAATGTAC CGCCTAGTGC GACATAACGC 4260

TTTATAATAG GACGCTTTCC AATAAAGACA AGTAATATGG CTGTAATTAC TGGAATAACT 4320 

AGCGTTAACA CAAGCATATT ACTTTCAATC ATCTTCTGGA ACTCCTTTCA TACTCTCAAC 4380 

GTTATCTGTG CCTAATTCTT TATATGTTCT AAATGCTAAT ACTAAGAAAA AGGCTGTTGT 4440 

CGCAAgGCGA TAACGATTGC TGTTAAAATA AGTGCTTGCG GGaTAGGaTC AACATAGCTT 4500 

TTTACGTTCG CTTCATAAAT TGGAACAGTA CCATGTTTAA GTCCGCCCAT AGTTATTAAA 4560

AATAAATTTG CTGCATGTGT TAATAGTGTA GTTCCCATAA CAATTCGTAT CAGACTTTTA 4620

GACAAAACGA GATAGACACT AATTGCTGTG AGAATACCAC TAACAAAAAT CATAATAATT 4680

TCCACTATTC GTTCTCTCCA ATCGAAATAA TAATTGTCAT GACAGTACCA ACTACTGCAC 4740

ATAAAACACC GAAATCAAAG AATACTGCTG TTGTCATATG AACAGGTTCT AATATAAATA 4800

ACGGTATATC AAATGTGACA TGCGTAAAGA AATTTTTGCC TAAAAACCAA CTTGCGATAG 4860

GCGTCGCAAT ACAAAAAACT AATCCGATAC CTATCAAGAT TTTAAAATCT AATGGGAAAA 4920

TTTTACGCAT TGTTTCTATA TCAAATGCAA TCGTAATGAT AACAAGTGAA CTTGCGAATA 4980

ATAATCCGCC GACGAAACCG CCACCAGGTG TATAATGTCC TGCTAAGAAA AGTGAAAAAC 5040 

CAAAGACCAT TACCATGAAA AAGATAATAA CTGCAGCAAA TTGCAAAATT AGATCATTTT 5100 

GTTGTCTATT CATGATTTTT CACCTCGTTA CCTTGCGTTT GACGCTTTTT ACGTAATTTA 5160 

ATCATTGTAT ATACAGCTAA TCCTGCGATA CCAAGCACAG ATGACTCGAA TAAAGTATCC 5220 

ATACCACGGA AATCAACAAG TATGACGTTT ACCATGTTTT TACCGTGAGC tAAATCATAA 5280

ACGTGCTCTT GATAAAACTT AGATATCGAT TCAAAATGTC TATTTCCGTA TGCAATTAAA 5340

CCGATAATAA TGACGGACAA ACCAACACCA CCAGCAATTA AAGCATTAGT AAGCTGGAAT 5400

GAGCGCTTTT CATTATAACG ATTTAAATTT GGTAAGTGGT AGAAGCATAA TAAGAACAAT 5460 
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ATAAACAATA CAGACACAGC ATATCCAACT GCACTTAACA TAATGATGCT AAATAATCTT 5580 

GATTTAGCGA AAAGAATTAA AAAGGCAGCA CTTAATAATA AAATTACGAT ACAAACTTCG 5640 

AAAATTCTAA TCGGACTAAC GTCTTTAAAA TTAATGTTGA AAGGTACTGA GAATATAGTG 5700 

ACAAATGTTA ATAAAATTAA TGCACCAAAA ATGATAACTA AATTATTACG TGAATAATCG 5760 

GTAACATAGC TATTCGTCAT CTTTTCAGAG TAGTTTGGAA TAACATTTGC ACTTCTGTTG 5820 

TACCAATAAT TGAATGTTAG TTTACCAGGT TGTCGTTGCA ACAATTTCAC CCAATAACTA 5880 

AATGTCACAA TTAGTAAGAT ACCTAAAATA TAAATCACTA ATGTTGATAA AAAGGCAGGC 5940 

GTTAATCCAT GGAACATATG GAATTCAACA TCATCAATTA CCGTATGATT AATCGAAGag 6000 

TnAGCTGGTT CAATAATCGA ATTAGTTAAA ATGCCAGGGA ATAAACCAAA TACAATTACT 6060 

AATGTAGCTA AAATAGCTGG TGATAAAAGC ATTAATATTG ATACTTCGTG TGCTTTTTTA 6120 

GGTAATTGTT CAGGTTTATA TTGTCCGAAA AATATATGCA TTATAAATTT AATTGAATAT 6180 

ACAAATGTGA AGACACTGCC CACTATACCA ATGATTGGGA ATAGGTAGCC TAATGTATCA 6240 

ACACTGAATA AATTTGCTTG GCTTGCTGTA AATGTTGTTT CTAAAAATGA TTCTTTTGAT 6300 

AAGAAACCAT TGAACGGTGG TACACCAGCg CATACTTAAT GCTGTAATAA CAGTGATTGT 6360

AAATGAAATA GGCATAATTG TTAGTAAGCC ACCTAATTTC TTAACATCAC GTGTACCAGT 6420 

AGAATGATCC ACTGCACCTG TAATCATAAA TAGGGCACCT TTAAATGTTG CATGGTTGAT 6480 

TAAATGGAAT ATTGCAGCCG TAAATGCAGC AGCATATATT TTGCTATCAT CGCCTTGATA 6540

GTGATAACTA ATGGCACCGA TTCCAAGCAT CGCCATAATC ATACCTAATT GGGATACTGT 6600 

TGAAAATGCC AGTATACCTT TCAAGTCTTG TTGTTTTGTT GCGTTTAGCG AAgCCCAGAA 6660 

TAATGTAATT AAACCAACGA GTGTGACAGT CCATACCCAA CCTTGCGATG CTGCGAAGAT 6720 

TGGTGTCATT CGAGCGATTA AATATAACCC TGCTTTAACC ATTGTTGCTG AATGAAGATA 6780 

AGCACTGACT GGTGTAGGTG CTTCCATTGC ATCTGGTAGC CAAATATAAA ATGGAAACTG 6840

AGCAGATTTT GTAAAAGCAC CAATCATGAT TAAAATCATC GCAAAAATGA AGAATGGGCT 6900 

ATTTTGAATT TCAGAAGCAT GTTGAATCAT GTACTGAATG CTAAATGATT GTGTTGGTAT 6960

AGCGAGTAAG ATGATACCAC CTAATAATGA TAGACCACCA AATACTGTGA TTATGAGCGA 7020 

TTTTTGAGCA CCATATATAG ATGCTTGTCG TTCGCGCCAG AATGAAATAA GTAAAAAACT 7080

AGAAAATGAC GTTAGCTCCC AGAATAAATA TAGAATAATA ACATTATCTG AAAGTACGAC 7140

ACCTAACATT GCACCCATAA ATAGTAATAA ATAACAATAA AAATTCCCTA GTTGTTCTGA 7200

CTTACTTAAG TAGCCGATTG AATATAATAC TACTAAACTG CCGATTCCTG AAATAAGCAA 7260 
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AACATATCAA GGTGCGTGTA 


CTGGTATTCA ACCATACGGT GCGTTTGTTG AGACCCCTAA 9180

TCATACTGAA GGACTGATTC ATATATCAGA AATTATGGAT GACTACGTTC ATAATTTGAA 9240

GAAATTTCTA TCAGAAGGCC AAATTGTTAA AGCTAAAATT TTGTCTATAG ATGATGAAGG 9300

AAAGCTTAAT CTATCATTAA AGGATAATGA TTACTTCAAA AATTATGAGC GTAAGAAGGA 9360

AAAACAATCA GTATTAGATG AAATCAGAGA AACAGAAAAA TATGGGTTTC AAACACTTAA 9420

AGAACGCTTA CCAATCTGGA TAAAACAGTC AAAGCGAGCA ATTCGAAACG ACTAAAGGAA 9480

CAGATAAATC GTACCGAAAA TCATACAAAG GGTCTGAAAT GAAAGTTTCT TAGACTATAA 9540

AAGAGATTAG TATCTATTAA ATTTTATTAG ATACTAATCT CTTTTTGTCT ACGATAACGT 9600

AATATGaTTG ATTCTATTTA CACGTACAAA TGGTTTAAGG TGACATATCC ATTATCTTTG 9660

TTAGATAGAA TCGTTGATTT GCaATATTGT ATGTGGATTT GTTTTTTTTA TTTATTTTAG 9720

AAATGAGAAC TACAACTTAA AGTATTAAAC GAATTGCAAC TATATAAACA GATAATTGGA 9780

GAATGAAAAA ATTACATGTT ATAGTCAACT CAATAATTTT AAGGAGGAAT TAAGTAATGA 9840

AAAGTAAATA CGAACCATTG TTTGATAAAG TAGAATTACC AAATGGAG7A GAGTTGAGAA 9900

ATCGATTTGT GTTAGCCCCT TTAACACATA TTTCTTCAAA TGATGATGGT ACTATTTCAG 9960

ATGTAGAACT TCCTTATATT GAAAAGCGTT CACAAGATGT TGGTATTACA ATTAATGCTG 10020

CGAGTAATGT GAGTGATGTC GGAAAAGCAT TTCCAGGACA GCCATCAATC GCGCATGACA 10080

GTAATATTGA AGGACTAAAA CGATTAGCTA CAGCAATGAA GAAAAACGGT GCCAAAGCAC 10140

TCGTACAAAT ACATCATGGC GGTGCACAAG CATTGCCTGA ATTAACACCT GATGGAGACG 10200

TCGTAGCACC AAGTCCAATT TCTTTAAAAA GTTTTGGTCA GAAACAAGAA CATAGTGCTA 10260

GAGAAATGAC GAATGAAGAG ATTGAACAAG CAATCAAGGA TTTTGGTGAA GCAACGCGAC 10320

GTGCAATTGA AGCAGGGTTT GATGGTGTTG AAATACATGG CGCGAATCAT TACTTAATTC 10380

ATCAATTTGT ATCACCATAC TATAATAGAA GAAATGATGT ATGGGCAAAT CAATATAAAT 10440

TCCCGGTCGC TGTGATTGAA GAAGTACTTA AAGCGAAAGA AGCGTATGGC AATAAAGACT 10500

TTATAGTTGG ATACAGATTA TCTCCAGAGG AAGCGGAGTC TCCAGGAATC ACAATGGAAA 10560

TTACAGAGGA ACTCGTTAAT AAAATTAGCC ATATGCCAAT CGACTATATT CATGTTTCAA 10620

7GATGGATAC GCATGCAACG ACACGTGAAG GTAAATACGC TGGACAAGAA AGACTGCCTT 10680

TAATTCACAA ATGGATAAAT GGTCGTATGC CACTTATCGG TATTGGTTCA ATTTTCACAG 10740

CTGACGAAGC TTTAGATGCA GTTGAAAATG TTGGTGTTGA CTTAGTAGCC ATTGGTAGAG 10800

AGCTACTACT GGATTATCAA TTTGTTGAAA AAATTAAAGA TGGACGGGAA GATGAAATTA 10860

AATTTAATGA AGGGTTTTAT CCATTACCAC GTA 10953



(2) INFORMATION FOR SEQ ID NO: 63:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8155 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63: 
TTTGATAnAA AACTGAATnA ATTAAATGTA TCGATTCAAC CTAATGAAGT GAATTTACAA 60
GTTAAAGTAG AGCCTTTTAG CAnAAAGGTT AAAGTAAATG TTAAACAGAA AGGTAGTTTA 120
GCAGATGATA AAGAGTTAAG TTCGATTGAT TTAGAAGATA AAGAAATTGA AATCTTCGGT 180
AGTCGAGATG ACTTACAAAA TATAAGCGAA GTTGATGCAG AAGTAGATTT AGATGGTATT 240
TCAGAATCAA CTGAAAAGAC TGTAAAAATC AATTTwCCAG AACATGTCAC TAAAGCACAA 300
CCAAGTGAAA CGmAGGCTTA TATAAATGTA AAATAAATAG CTAAATTAAA GGAGAGTAAA 360
CAATGGGAAA ATATTTTGGT ACAGACGGAg TAAGAGGTGT CGCAAACCAA GAACTAACAC 420
CTGAATTGGC ATTTAAATTA GGAAGATACG GTGGCTATGT TCTAGCaCAT AATAAAGGTG 480
AAAAACACCC ACGTGTACTT GTAGGTCGCG ATACTAGAGT TTCAGGTGAA ATGTTAGAAT 540
CAGCATTAAT AGCTGGTTTG ATTTCAATTG GTGCAGAAGT GATGCGATTA GGTATTATTT 600
CAACACCAGG TGTTGCATAT TTAACACGCG ATATGGGTGC AGAGTTAGGT GTAATGATTT 660
CAGCCTCTCA TAATCCAGTT GCAGATAATG GTATTAAATT CTTTGGATCA GATGGTTTTA 720
AACTATCAGA TGAACAAGAA AATGAAATTG AAGCATTATT GGATCAAGAA AACCCAGAAT 780
TACCAAGACC AGTTGGCAAT GATATTGTAC ATTATTCAGA TTACTTTGAA GGGGCACAAA 840
AATATTTGAG CTATTTAAAA TCAACAGTAG ATGTTAACTT TGAAGGTTTG AAAATTGCTT 900
TAGATGGTGC AAATGGTTCA ACATCATCAC TAGCGCCATT CTTATTTGGT GACTTAGAAG 960
CAGATACTGA AACAATTGGA TGTAGTCCTG ATGGATATAA TATCAATGAG AAATGTGGCT 1020
CTACACATCC TGAAAAATTA GCTGAAAAAG TAGTTGAAAC TGAAAGTGAT TTTGGGTTAG 1080
CATTTGACGG CGATGGAGAC AGAATCATAG CAGTAGATGA GAATGGTCAA ATCGTTGACG 1140
GTGACCAAAT TATGTTTATT ATTGGTCAAG AAATGCATAA AAATCAAGAA TTGAATAATG 1200
ACATGATTGT TTCTACTGTT ATGAGTAATT TAGGTTTTTA CAAAGCGCTT GAACAAGAAG 1260
GAATTAAATC TAATAAAACT AAAGTTGGCG ACAGATATGT AGTAGAAGAA ATGCGTCGCG 1320
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CTGGTGATGG TTTATTAACT GGTATTCAAT TAGCTTCTGT AATAAAAATG ACTGGTAAAT 144 0 

CACTAAGTGA ATTAGCTGGA CAAATGAAAA AATATCCACA ATCATTAATT AACGTACGCG 1500 

TAACAGATAA ATATCGTGTT GAAGAAAATG TTGACGTTAA AGAAGTTATG ACTAAAGTAG 1560 

AAGTAGAAAT GAATGGAGAA GGTCGAATTT TAGTAAGACC TTCTGGAACA aACCATTAGT 1620 

TCGTGTCATG GTTGAAGCAG CAACTGATGA AGATGCTGAA aGATTTGCAC AACAAATAGC 1680 

TGATGTGGTT CAAGATAAAA TGGGATTAGA TAAATAAATA CTGTATTACA AATGAGCCGA 1740 

TGCGTATGcA ilTcgtTTTTT GTGTTTGTAG AAATAATTTA TAGTACAAAC GTAAAATGAT 1800 

ATAAACAAAA TAAAAACAAA GTAATCAATA TGTAATATAA AATACACTGG TACTCAATAT 1860 

ATAATGATGA TAAAATTAAT TTTAATTAGA TAGAGTTGCT TTGTGTTTTT AACGCAGATG 1920 

CTACTACTTA TCTTAACAGT TGATTAAGTG AAATCATTTA ACAGCGAGAA TAATCAACCA 1980 

GGAGGATGAC TTAATGAATT TATTCAGACA ACAAAAATTT AGTATCAGAA AATTTAATGT 2040

CGGTATTTTT TCAGCTTTAA TTGCCACTGT TACTTTTATA TCTACTAACC CGACAACAGC 2100 

GTCTGCAGCA GAGCAAAATC AGCCTGCACA AAATCAACCA GCACAACCAG CTGATGCCAA 2160

TACACAGCCT AACGCAAATG CTGGTGCTCA AGCTAATCCT ACAGCACAGC CAGCTGCACC 2220

TGCCAACCAA GGACAACCAG CAGTACAACC AGCAAACCAA GGTGGACAGG CTAATCCAGC 2280 

AGGAGGAGCA GCACAACCAA ATACACAACC AGCTGGACAA GGTGATCAAG CTGATCCGAA 2340 

TAACGCTGCA CAAGCACAAC CTGGAAATCA AGCAACACCG GCAAACCAAG CAGGTCAAGG 2400

AAATAACCAA GCAACACCTA ATAATAATGC AACACCGGCA AATCAAACAC AGCCAGCGAA 2460 

TGCTCCAGCA GCAGCGCAAC CAGCAGCACC TGTAGCAGCA AACGCACAAA CTCAAGATCC 2520 

AAATGCTAGC AATACTGGTG AAGGCAGTAT TAATACGACA TTAACATTTG ATGATCCTGC 2580 

CATATCAACA GATGAGAATA GACAGGATCC AACTGTAACT GTTACAGATA AAGTAAATGG 2640 

TTATTCATTA ATTAACAACG GTAAGATTGG TTTCGTTAAC TCAGAATTAA GACGAAGCGA 2700 

TATGTTTGAT AAGAATAACC CTCAAAACTA TCAAGCTAAA GGAAACGTGG CTGCATTAGG 2760 

TCGTGTGAAT GCAAATGATT CTACAGATCA TGGTAACTTT AACGGTATTT CAAAAACTGT 2820 

AAATGTAAAA CCAGATTCAG AATTAATTAT TAACTTTACT ACTATGCAAA CGAATAGTAA 2880

GCAAGGTGCA ACAAATTTAG TTATTAAAGA TGCTAAGAAA AATACTGAAT TAGCAACTGT 2940 

AAATGTTGCT AAGACTGGTA CTGCACATTT ATTTAAAGTA CCAACTGATG CTGATCGTTT 3000 

AGATTTACAA TTTATTCCTG ACAATACAGC AGTTGC7GAT GCTTCAAGAA TTACAACAAA 3060

TAAAGATGGT TATAAATACT ATTCATTCAT TGATAATGTA GGTCTATTCT CAGGATCACA 3120 
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_ TAATACTGAA ATCGGTAACA ATGGTAATTT 
ATATGAAGTA ACATTACCAC AAGGTGTAAC

CCCTAATGGT AATGAAGACA GTACAGTATT

TGCAAATAAA GTTACATTTA CAAGCCAAGG

AGAAGTTTTA TTCCCAGATA AATCTTTAAA

CGATACACCT AAAAATATTG ATTTTAATGA

TGTAATTAAT AATGCGCAAC CAGAAGTaCA

GAAATGAACA AAGATGCGTT GCAACAACAA

ACAACAGCAT CAATTGCAGA ATACAATAAA

GAAGATGCGA ATCATGTTAA AACTGCAAAT

GTAACTAAAT TACAAGCTGC ATTAATTGAT

AAAGCTCAAG AAAAGGTTAC AGCAGCACAA

GCAGCACTTG TAACTAAAAT TAACAATGAT

CAAACTACAG CACAAGGTGT CACAACTGAA

GATGTGATTA CACCAACAGT TAAACCTCAA

ACTCGTAAAC AACAAATTAA AAAGTCAAAT

AATGATAAAA TTGGTAAAAT TGAAACAAAG

AATGCACAAG TAGAAGCCAT TAAAACAAAA

GCTACAACAG CTAAAGCAGC AGCTCTTGAA

GATCAAGCAC CTTTAAATCC TGATACAACA

ATTAATGCAG CTAAAGTTTC TGGTGTTAAA

TTAGAAAGAG TTAAAAACGA AGAAATCTCA

ACAAAAATGG ATGCCTATAA TGAAGTTAAA

GCTACAGTTT CAAATGCAAC AAATGAAGAA

GCTCAAAAGC AAGGTTTACA TGACATCCAA

ACAAAATCAA AAGTATTAGA TAAAATCAAT

GCAGCTGATA CGGAAGTAGA AAACGCATAT

AATGCTTCAA CTACAGAAGA AAAACAAGCT

GAAGCAAGAA CAAATCTTGA TGCTGCAAAT
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TGGTGCTTCA TTAAAAGCAG ATCAATTTAA 3240 

TTACGTTAAT AATTCATTAA CTACAACATT 3300

GAAAAATATG ACTGTTAATT ATGATCAAAA 3360

TGTGACAACG GCACGTGGTA CACACACTAA 3420

ATTATCATAT AAAGTTAATG TTGCGAATAT 3480

AAAATTAACA TATCGTACTG CTTCAGATGT 3540

CTAACTGCAG A7CCATTTTC AGTAGCGGTT 3600

GTAAACTCAC AAGTTGATAA TAGTCATTAC 3660

CTTAAACAAC AAGCAGATAC TATTTTAAAT 3720

CGTGCATCTC AAGCGGATAT TGATGGTTTA 3780

AATCAAGCAG CAATTGCTGA ATTAGATACT 3840

CAAAGTAAAA AAGTTACGCA AGATGAAGTT 3900

AAAAATAATG CAATCGCAGA AATTAATAAA 3960

AAAGATAATG GTATCGCAGT GTTAGAACAA 4020

GCGAAACAAG ATATTATCCA AGCAGTTACA 4080

GCATCATTAC AAGATGAAAA AGATGTAGCA 4140

GCAATTAAAG ATATTGATGC AGCAACAACA 4200

GCAATCAATG ATATTAATCA AACTACACCT 4260

GAATTTGACG AAGTTGTTCA AGCACAAATT 4320

AATGAAGAAG TAGCGGAAgC TATTGAACGT 4380

GCAATTGAAG CGACAACGAC TGCACAAGAT 4440

AAAATTGAAA ATATTACTGA CTCTACGCAA 4500

CAAGCTGCAA CAGCTAGAAA AGCTCAAAAT 4560

GTAGCAGAAG CTGATGCAGC AGTAGATGCA 4620

GTTGTTAAAT CAAAACAGGA AGTTGCTGAT 4680

GCAATTCAAA CACAAGCAAA AGTTAAACCT 4740

AATACACGTA AACAAGAAAT TCAAAATAGC 4800

GCATATACAG AATTAGATAC TAAAAAGCAA 4860

ACAAACAGTG ATGTAACAAC AGCTAAAGAC 4920
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GCGGAAATCG CTCAAAAAGC AAGTGAACGT AAAACAGCAA TTGAAGCAAT GAATGATTCG 5040 

ACTACTGAAG AACAACAAGC AGCGAAAGAC AAAGTGGATC AAGCAGTAGT TACTGCAAAC 5100

GCTGATATAG ATAATGCTGC AGCAAACAAT GATGTGGATA ATGCAAAAAC TACAAATGAA 5160 

GCTACAATCG CAGCCATTAC ACCTGATGCA AATGTTAAAC CAGCAGCAAA ACAAGCAATT 5220 

GCAGATAAAG TACAAGCTCA AGAAACAGCA ATTGATGGAA ATAACGGCTC AACAACTGAA 5280 

GAAAAAGCAG CTGCTAAACA ACAAGTTCAA ACTGAAAAAA CAACAGCTGA TGCCGCAATA 5340 

GATGCAGCAC ATACAAA7GC GGAAGTTGAA GCGGCTAAAA AAGCAGGAAT TGCTAAAATT 5400 

GAAGCGATTC AGCCAGCAAC AACAACTAAA GATAATGCGA AAGAAGCAAT TGCTACGAAA 5460

GCGAATGAAC GTAAAACAGC AATCGCTCAA ACGCAAGACA TTACTGCTGA AGAAATTGCA 5520 

GCGGCTAATG CGGACGTAGA TAATGCTGTG ACACAAGCAA ATAGCAACAT TGAAGCTGCT 5580 

AATAGTCAAA ATGATGTAGA CCAAGCGAAA ACGACAGGTG AAAATAGTAT TGATCAAGTA 5640

ACACCAACAG TTAATAAAAA AGCAACTGCA CGTAATGAAA TCACAGCAAT TTTAAATAAC 5700 

AAATTGCAAG AGATTCAAGc tACGCCAGAT GCAACAGATG AAGAAAAACA AGCAGCTGAT 5760 

GCTGAAGCAA ATACTGAAAA TGGTAAAGCA AATCAAGCCA TTTCAGCAGC AACTACTAAC 5820

GCACAAGTTG ATGAAGCTAA AGCAAATGCA GAAGCAGCGA TTAATGCGGT AACACCAAAA 5880

GTTGTGAAGA AACAAGCGGC TAAAGATGAA ATTGATCAAT TACAAGCAAC GCAAACAAAT 5940

GTTATCAATA ATGATCAGAA CGCTACAACA GAAGAAAAAG AAGCAGCTAT TCAACAATTA 6000

GCAACAGCAG TTACAGACGC GAAAAATAAT ATTACAGCTG CAACTGATGA TAATGGTGTA 6060 

GATCAGGCGA AAGACGCTGG AAAGAATTCA ATTCAAAGCA CGCAACCAGC AACAGCGGTT 6120 

AAATCAAATG CTAAAAATGA TGTTGATCAA GCTGTGACAA CTCAAAATCA AGCAATTGAT 6180 

AATAGAACTG GTGCTACAAC TGAAGAGAAA AATGCAGCAA AAGATTTAGT TTTAAAAGCT 6240 

AAAGAAAAAG CGTATCAAGA TATCTTAAAT GCACAAACAA CTAATGATGT TACGCAAATT 6300 

AAAGATCAAG CAGTTGCTGA TATTCAAGGT ATTACTGCAG ATACAACAAT TAAAGATGTT 6360 

GCGAAAGATG AATTAGCAAC AAAAGCAAAC GAACAAAAAG CGCTTATTGC ACAAACTGCA 6420

GATGCGACTA CTGAAGAAAA AGAACAAGCA AATCAACAAG TAGACGCACA ATTAACACAA 6480 

GGTAATCAAA ATATTGAAAA TGCACAGTCA ATCGATGATG TAAACACTGC AAAAGATAAT 6540

GCAATTCAAG CAATTGACCC AATTCAAGCA TCAACAGATG TTAAAACGAA TGCAAGAGCG 6600

GAATTGCTAA CTGAAATGCA AAATAAAATA ACTGAAATAC TTAATAATAA TGAGACTACT 6660

AATGAAGAAA AAGGTAACGA TATTGGACCA GTTAGAGCAG CATATGAAGA AGGTTTAAAT 6720 
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AAAGTTCAAC AACTTCATGC AAATCCTGTT AAGAAACCAG CAGGTAAAAA AGAATTAGAT 6 840 

CAAGCTGCAG CTGATAAGAA AACACAAATA GAACAAACAC CAAATGCATC ACAACAAGAA 6900 

ATTAATGATG CAAAACAAGA AGTTGATACT GAATTAAATC AAGCGAAAAC AAATGTCGAT 6960

CAATCATCAA CAAATGAATA TGTTGATAAT GCAGTTAAAG AAGGAAAAGC TAAAATTAAT 7020

GCAGTTAAAA CATTTAGTGA GTACAAAAAA GATGCTTTAG CTAAAATTGA AGATGCATAT 7080

AATGCTAAAG TAAACGAAGC GGATAACTCT AACGCATCGA CTTCAAGTGA AATTGCTGAA 7140

GCGAAACAAA AACTTGCTGA ATTAAAACAA ACTGCGGATC AAAATGTTAA TCAAGCTACT 7200

TCTAAAGATG ACATTGAAGT TCAAATTCAT AATGACTTAG ATAATATTAA CGATTACACA 7260

ATTCCAACAG GTAAAAAAGA ATCAGCTACA ACAGATTTAT ATGCTTATGC AGATCAGAAG 7320

AAAAATAATA TTTCAGCTGA CACTAATGCA ACACAAGATG AAAAGCAACA AGCAATTAAG 7380

CAAGTTGACC AAAATGTTCA AACTGCATTA GAAAGCATTA ATAATGGTGT GGATAATGGT 7440

GACGTTGATG ATGCATTAAC ACAAGGTAAA GCAGCAATTG ATGCTATTCA AGTAGATGCT 7500

ACTGTTAAAC CTAAAGCGAA CCAAGCTATT GAAGTTAAAG CAGAAGATAC GAAAGAATCT 7560

ATTGATCAAA GTGACCAGTT AACTGCTGAA GAAAAAACTG AAGCATTAGC AATGATTAAA 7620

CAAATTACAG ATCAAGCTAA ACAAGGTATT ACTGATGCAA CAACAACTGC TGAAGTTGAA 7680

AAAGCGAAAg cTCaAGGACT TGAAGCATTT GATAACATTC AAATCGACTC AACAGAAAAA 7740

CAAAAAGCTA TCGAAGAATT AGAAACTGCA CTAGACCAGA TTGAAGCAGG TGTAAATGTC 7800

AACGCTGATG CTACAACTGA AGAAAAAGAA GCGTTTACGA ATGCTTTAGA AGACATTTTA 7860

TCAAAAGCAA CTGaAGATAT TTCTGATCAA ACTACAAATG CAGAAATCGC TACTGTCAAA 7920

AATAGTGCGC TTGAACAACT TAAAGCACAA CGTATTAATC CTGAAGTTAA GAAAAATGCT 7980

TTGGXAGCAA TCAGAGAAGT GGTTAACAAG CAAATAGGAA tAATTAAAAA TGCAGATGCA 8040

GATGCATCGG CGGAAAGAnA TTGCACGTAC GGGATTTAGG TAGATATTTT GGACCGATTT 8100

GCTGGATAAA TTTAGGGTnA AACCCCAACC AATGCCGAAG TTGCCTGAAT TACCA 8155 
(2) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1530 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64: 
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CTGTTTTATT TGCAGCACCC ATACTGGAAA TCACTTTAAT CCCTCGGTCA AGACACTCTT 120 

TCATTAAGTG TACTTTGTAC ATTATTGTAT CACTTGCATC TACAAAATAA TCTATATCGT 180 

AGTTATCGAA AATTTCTTCA TATGTCTCTT CTGTATAAAA CATATGTAAG GGCGTGACTT 240

TACAATCTGG ATTAATTAAT TTAATACGTT CTTCCATCAA AGAAACTTTA CTTTGTCCTA 300 

CCGTTGTAGT TAAAGCGTGT AATTGTCTGT TTACATTTGT AATATCAACA TCATCTTTAT 360 

CTATTAATAT AATATGACCA ATATTCGTTC TTGCTAATGC TTCAGCAGCA AATGAACCAA 420

CACCTCCAAC GCCAAGTA7G ACAACAGTTT GTTGCTTCAA TAAATCTAAA CCTTGTTGTC 480 

CAATCGCTAG TTCATTTCTT GAAAATTGAT GTTTCATTAT TTTACCTCTT TCACTGATTT 540

ATACATAAGT ACATAGTAAC TTAAAATTTT ATATTTAGCA TTATCACTTT GATTATTTTC 600 

CCAAAATTCA ACGAGGAAAC ATTTATTAAA CGCTATAAAA CCCAACTAAT TCTTTATTAA 660 

AAACTTAAAG AAACGCATAA AAATACGCAA GACAAAGTCT TGCGTATCGA TAGAGTCCGT 720 

ATTGCCGTAG TTATAATAGC TTGATCATTC GGCCTGTTAT ATACAGGTGG GTGCCCTGTT 780 

TCTTGTTTTG TACGTCCTTC ATATAAGGCG TGTACGCTGC AAGAAAACCC ATTGGGCTCC 840 

CTTGATCAAA GAGTGTTAGG CCCAAATTAA AAAGCAAACT TACGAACAAC TCAGATGACT 900

ATCTTATGAT GTTATATTAC CACATAATTA AAATTAATGA AATTATAACA AACCAAAGTT 960 

TATTGATTTT TTAAAATTTA GTGACGAATT CGCAAAGAAA GTTCTTCTAA TTGTTTATCA 1020 

GAAACTTCAC TAGGCGCATT CGTTAATAAA CATGTAGCAG ATGCTGTTTT AGGGAATGCG 1080

ATTGTATCTC TCAAGTTTGT TCTATTAGTC AATAACATGA CTAATCGGTC tAATCCTAAT 1140

GCAATACCGC CATGTGGTGG TGCACCATAT TTAAATGCAT CTAGTaAGAA GCCGAACTGT 1200

TCCTgTGCTT GTTCTTTAGT AAATCCAAGA ACTTCGAACA TTTTTTCTTG TAACTCACCA 1260

TCATGAATTC TGATTGAACC GCCACCTAAT TCATAACCAT TTAATACTAT GTCATAAGCA 1320

TTTGCCTCAG CTTCtTCTGG CGCAGTGCCA AGCTTAGCAA TATCAGCTTC TTTTGGAGAT 1380

GTAAATGGAT GATGTGCTGC AACGTAACGT TTCGCATCTT CATCATATTC TAATAATGGC 1440

CAATCTGTCA CCCATAAGAA GTTTAATTTT GTTTCATCGA TTAAACCTAA T7CTTTAGCT 1500

AA7TTGACAC GTAATGCACC TAAACTTTGT GCAACGACAT TTGGTttGTC TGCAACAAAC 1560

ATTACTAAGT CACCAGCTTC AGCACCAGTT AATGTAAGTA ATGTTTCAAC ATTTTCTGTT 1620 

CAAAGAAACG 1630 

(2) INFORMATION FOR SEQ ID NO: 65:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 732 base pairs 
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(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:

CAATTGGACA TCTTGTATGA AAAGGACAAC CTTGCGGCGG ATTACTTGGC GAAGGTAATT 60

CTCCTTTTAA TATAATTCTA TTGTTATTAT GTTTATCAAT TTGTGGTATT GATGAAATCA 120

ACGCTTTTGT ATATGGATGT TTGGGATTTT CATAAATTTC TTTATCAGAT GCGATTTCAA 180 

CTATATGACC TAAATACATA ACTCCAATGA CATCACTTAT ATGTTTTACT ACACTTAAAT 240 

CATGTGCGAT AAATAAATAG CTTAAGTTAA ATTGTTCTTG TAAATCTTTT AATAAATTCA 300

GTACTTGAGA TTGAACAGAT ACATCTAATG CACTTACAGG CTCATCAGCA ACAATTAAAC 360 

TCGGACGCAA AGCCAATGCT CTTGCAATTC CCACTCTTTG TCTCTGTCCA CCTGAAAATT 420 

CATGTGCATA TTtATAATAT GCATCTTCAC TTAGGCCAAC ACATTTTAAT AAATATAGTA 480

CTTCTTTTTT TATTTCTTCT TTTGGCAATT TTTTATAATT TAAAATAGGT TCTGAAATGA 540

TATCTCCAAC CATTTGCATC GGATTCAATG ATGCATACGG ATCTTGAAAT ATCATCTGAT 600 

ATTGTTGTCG TGATTTTCTG AGTTTTTTAC CTTGTAATCT TGTTATATCT TCACCATTAA 660 

CAATTATTGA GCCTGAAGTT GCATCTTCAA GCCTGATAAT CACTTTACCT AACGTTGACT 720 

TACCACAACC CG 732 
(2) INFORMATION FOR SEQ ID NO: 66: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5838 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:

AATATATTCA TATGTTTCAT CAACAATATT AGCTGCTTTT TGAATTAAAG CAATTTCGTC 60

AGCATCTTTG ACGTCTCTAA TTTTATCTAC AGTATTAGAA ATGCTTATTA ATGATATACG 120

GCTTTTATTT AATTCAAGGT ATGTATCATA ACTTACATGA TGCCCCTCAA AACCTACATT 180 

TTCAAAATTT TCTTGGTGTA GCAATTCTTT AATCTCACCA ATAATAGTAG ATTTACGATT 240

AATAATTTCA TAATTTGGCG CCTGCTTAGT TGCTTGATCA ATATATCTAA AGTCTGTTAT 300 

CAAATATTGT TTATCTTTAG ATATGATAAG TGCTCCACTG GTACCAGTAA AACCTGATAA 360 

ATATCTTCTA TTGTAATCCG AAAGAATGaT AATCGCATCT AAATGTTTTT GTTCTAAAAT 420
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CAACTTTATA CATTAAAATA ATATCATAAT AAGGATAAAA AATAATAGAT ATTGATTTTA 540 

GGGAGATAGT AATGAAAAAA TTGGTTTCAA TTGTTGGCGC AACATTATTG TTAGCTGGAT 600 

GTGGATCACA AAATTTAGCA CCATTAGAAG AnAAAACAAC AGATTTAAGA GAAGATAATC 660 

ATCAACTCAA ACTAGATATT CAAGAACTTA ATCAACAAAT TAGTGATTCT AAATCTAAAA 720 

TTAAAGGGCT TGAAAAGGAT AAAGAAAACA GTAAAAAAAC TGCATCTAAT AATACGAAAA 780 

TTAAATTGAT GAATGTTACA TCAACATACT ACGACAAAGT TGCTAAAGCT TTGAAATCCT 840 

ATAACGATAT TGAGAAAGAT GTAAGTAAAA ACAAAGGCGA TAAGAATGTT CAATCGAAAT 900 

TAAATCAAAT TTCTAATGAT ATTCAAAGTG CTCACACTTC ATACAAAGAT GCTATCGATG 960 

GTTTATCACT TAGTGATGAT GATAAAAAAA CGTCTAAAAA TATCGATAAA TTAAACTCTG 1020 

ATTTGAATCA TGCATTTGAT GATATTAAAA ATGGCTATCA AAATAAAGAT AAAAAACAAC 1080 

TTACAAAAGG ACAACAAGCG TTGTCAAAAT TAAACTTAAA TGCAAAATCA TGATAGGAGT 1140 

CTTTTAATGC GTAATATAAT ATTTTATCTT GTACTTATTA TTGCTGCGAT TGGATTAGTA 1200

ATGAATCTAG ATGCCTTTAT TTTTTCAATC GTCAGAATGT TAATCAGCTT TGcgTAaTAG 1260

CTGGTATTAT TTATCTGATT TATTATTTCT TCATCTTAAC TGAAGACCAA CGCAAATATC 1320

GCAAAGCAAT GCgTrAaGTA TAAAAGAAAT CAAAGAAGAA AATAGATAAA AAAACGGAAG 1380

CACTTGTAGG TAAAATAGTC TACGTGCTTC CATTTTTTAT TCTAAAAACT ACTTTCTAAA 1440 

CATCCATTCA TCTGAACGAT ATTTTTCAGT TAATTCTTCC ACTTCTGCCA ATTGAGCTTC 1500 

TGCTAATTCA AGTGGCTTTA ATTCTATATT TAAACCTTTC TTAAAACCTT TCTCGAAAGC 1560 

TTC7TCCATT TGACTAATAG TAATGTGTTC ATCTGAAATA TCATTGATGG CAACTGCTTT 1620 

TTCAACGAAT GCCTCTTTCA TTTTTAATTT TAA7CTCTCA TTTTTATAAA TrAACATATC 1680

AAACAGTTCA TCAATATCAA TATCTTGTAA AATCGAACCG TGTTGGAGGA TTACGCCCTT 1740 

TTGTCTCGTT TGAGCACTCC CAGCAATCTT ACGGCCTTCA ACAACTAGCT CATACCAACT 1800 

TGGTGCATCA AAACACACTG AACTTCGAGG TTGTTTTAAT TTTTGACGCT CTTCAGGCGT 1860

TTTAGGTACC GCAAAATAAG TATCAAATCC TAAGTTTTTA AATCCTTCTA ATAATCCTTG 1920

TGAAATCACT CTGTACGC7T CTGTAACTGT AGAAGGCATA TTCGGATGCG ATTCAGGCAC 1980

AATCACACTG TAAGTTAACT CTTTATCATG TAGCACCCCA CGGCCACCAG TTTGACGCCT 2040 

TACGAGACCA AAACCTTTCT CTTTAACCTT ATCAATATCA ATTTCTTTTT GTAGCCTTTG 2100 

GAAATACCCT ATTGATAATG TTGCAGGATT CCATGTGTAA AAACGTATAA CTGGATCAAT 2160 

TTCACCTCTA GAGACAAAAT TTAATAACGC TTCATCCATT GCCATATTAT AATATGGGTC 2220 
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TO 



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30 






AAATGTATAA TATTTGATTC GCTAATTAAT CAATTTAACT AAATGAATAA TAATTGCAAT 
TCTTTAGTGA AATATTTTGA TAATTTGACC TAACAGTCTT ATAATTATAT TATCGTTTAA 
TTAGGGAGGA TGCAAGATGA GTGCTAGTTT GTACATCGCA ATAATTTTAG TTATAGCAAT 
TATTGCTTAT ATGATTGTTC AACAAATTCT TAACAAGCGA GCTGTTAAAG AATTAGATCA 
AAATGAATTC CATAATGGGA TTAGAAAAGC TCAAGTCATC GATGTTAGAG AGAAAGTTGA 
CTATGACTAC GGTCACATTA ATGGGTCTCG CAATATTCCT ATGACAATGT TCAGGCAACG 
ATTCCAAGGA TTAAGAAAAG ATCAACCGGT ATACTTATGT GATGCCAATG GGATTGCTAG 
CTATAGAGCC GCTCGTATTT TGAAAAAGAA TGGATATACA GATATCTATA TGTTAAAAGG 
CGGCTATAAA AAATGGAGTG GAAAAATAAA GTCTAAAAAA TAGTTTTTGT AAATTTAATA 
TACGATTTAA TAAAATCTGA GTGTTAATTG ATCATCAATA ACAATACTCA GATTTTAATT 
TTTTAACAAA GTCTGTTACT ATATTTCTCT AGCTTCACTG ATCATTAAAC TTAGTTTCAG 
CATAATAAAG AAAGTTCAGC TCATTTTCAA TACGATTCAA TTACCGCAAT CTAAAAAATG 
AAAAGACAAT TTCTATGAAA GAATAATACC AAACCCTAAG AGTTATTACT TCGGTTTAGT 
TTTCTTGTTT AAATAGAAAT TGTCTTTTTC AATTGATTTT GAAACCATTA TCCTTAAATC 
TTCATACAAA GTTAGAATAA TAATTCTCGG AATATGTGTT TAATACTTTA TTTTTCCTGT 
TTAAGATTTT CAAACTTTAA TATTGGTTTA CGAGCAGCTG TAGCTTCGTC TAATCGATCA 
ATCACAGTTG TATGTGGTGC TTCTAGCacT TTATCAGGAT CATTTTTAGC TTCTTCAGCA 
ATACTAATTA ATGTATCGAT AAAATAATCA AGTGTTTCTT TAGACTCTGT CTCAGTCGGT 
TCAATCA7CA TACCTTCTTC AACATTTAAT GGGAAGTATA TTGTTGGTGG ATGTACACCG 



AATCTAATA ATCGCTTAGC CATGTCTAAA GTACGTACAC CAAATTCTTT TTGACGCACA 
CACTTAACA CAAACTCGTG TTTACAATAT TGTTTATAAG GTATTTCAAA GTGTTTAGAT 
AACGTGCTT TAATATAATT CGCATTAAGA ACCGCTGCTT CAGAAACCTC TTTAAGTCCA 
TTGCTCCCA TAGTTCGAAT ATACGTATAA GCTCTTAAGT AAATACCAAA GTTACCATAA 
ATGGTTTTA CACGTCCGAT AGAATTTTTA ATGTCATTAT CATATTTAAA TTTGTCGCCA 
CTTTAATAA CCATTGGCTT TGGTAAGTAA CTTGCTAGTT CTTTTACTAC ACCGACTGGA 
CTGAACCAG GACCGCCACC ACCATGTGGA CCAGTAAATG TTTTATGCAA GTTTAAATGA 
.CAGCATCAA ATCCCATATC TCCTGGGCGA ACTTTGTCCA TAATAGCGTT TAAATTCGCA 
CATCATAAT ATAATAGACC ACCAGCATTA TGGACGATTT CACGGATTTC CATAATATTT 
TTTCGAAAA TACCTAAAGT GTTTGGATTA GTTAACATAA TAGCTGCTGT ATTTTCATTT 
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GATTTAAATC CTGCAAATGa AGCTGAGGCT GGaTTCGTAC CATGCGCAGA ATCTGGcACA 4140 

ATGACTTCAT CACGATGACC TTCACCATTA TTCTCATGGT AAGCTTTAAA TATCATCAAT 4200 

GCAGTCCATT CACCATGTGC GCCAGCAGCT GGTTGTAATG TCACCTCATC CATACCAGTA 4260 

ATTTCTTTTA ATTCTTCTTG CAAACTATAA ATAATTTCTA ATGAACCTTG AACTTGATCT 4320 

TCATCTTGTA ATGGATGTGA TTCACTAAAT CCTGGTATTC TAGCAACCTT TTCATTAATT 4380 

TTAGGGTTAT ACTTCATCGT ACATGAACCC AATGGATAAA ATCCGTTGTC TACACCGAAA 4440 

TTTTTATTTG AAAGTTCAGT ATAATGACGT ACTAAGTCTA GTTCAGCAAC TTCAGGAAAC 4500 

TCCGCTTTGT TTTTACGAAT AAATTTATCA TCTAACAATG ACTCAACAGA ATTTGTTTTA 4560 

ATATCACTTT TTGGTAATGA ATATGCATAT CTGCCTTCAC GAGATCTTTC AAAAATTAAT 4620 

GGACTTGATT TACTAGTCAT TTAACTCACC AGCCTTTTCT ACAAATGTAT CGATTTCATC 4680 

TTTTGTTCTT AATTCAGTTA CAGCTATTAA CATGTGATTT TTAAAGTCGT CTGAAACAAC 4740

ACCTAAATCA AAACCACCGA TAATATTGTA CTTCACTAAT TCCTCGTTAA CTTGTTGAAT 4800

TGGTTTGTCA AATTTGACTA CAAACTCATT GmnAAGnTGT ACCATCTAAT ACTTCAAAAC 4860

CTTTTTTAAT AAATTGTTGT TTAGCATAGT TAGCATGTTC TATATTTTGA ACTGCAATAT 4920

CATAGATACC TTGTTTACCA AGTGCTGACA TTGCAATTGA TGaCGcTAAA GCATTTAATG 4980 

CTTGGTTAGA ACAAATATTA GATGTCGCTT TATCGCGTCG AATATGTTGT TCACGTGCTT 5040 

GTAATGTTAA TACAAAGCCA CGATTACCTT CATCATCTTG TGTTTGACCG ACTAATCTAC 5100 

CTGGCACTTT ACGCATTAAC TTTTTCGTCG TTGCAAAATA TCCACAATGT GGCCCACCGA 5160

ATTGAGCAGG AATTCCGAAT GGCTGAGTAT CACCTACAAC AATATCTGCA CCAAATGAAC 5220

CTGGAGGTGT AAGTAATCCC AATGCTAATG GATTTGCATA TACGATAAAT AATGCTTTTT 5280

TATCTTCAAT AAAGCTATGA ATCTTTTCAA GATCTTCAAT TGAACCGTAA AAGTTTGGAT 5340 

ATTGTACTGC AACAGCTGCT GTTTCATCAT CCACTGCTGC TTCTAATTTT TTCAAATCTG 5400 

TAACAGTGCC ATCTAAATCG ATTTCCACTA CTTCGAATTC CTTACGCGTC TTAGCATAAG 5460 

TATGAAGTAC TTGTAATGCT TGATAATGTA AACCTTTTGA GACTACAATT TTATTTTTCT 5520 

TTGTTTGACT AAATGCTAAG ATACATGCTT CAGCAAAGCT AGTCATCCCA TCATACATAG 5580

AAGAATTTGC TACATCCATA TCTGTTAATT CACAAATTAA AGTTTGGAAC TCAAAAATGG 5640

CTTGTAATTC ACCTTGAGAA ATTTCCGGTT GATATGGCGT ATATGCTGTG TAAAATTCTG 5700

ATCTTGAAAT CATAGCATCC ACAACTGATG GCGCGTAATG ATCA7AAACA CCAGCACCCA 5760 

rAAATGATGT ATGCGTTTCT TTAGTGATAT CCTTGCTkGC AATGGGGATT TAAACnTCTA 5820 
(2) INFORMATION FOR SEQ ID NO: 67:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18355 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67: 

ATnATAATTG GCTTTGCTAA TAATTACTTC CCTGAATTAC aAGTATTAGC AAACGAAATA 60 

AAATCTGATA TGGCTAGTTC ATTAAAACAA TGATATTTTT ATTTAAATTT TTaAAGCTTT 120 

GTACGAAATT GTACAAAGCT TTTTTGGTGC GTATTGTATG GGCAACAACT TGACGATGAA 180 

AATCCGTTAC AGGATTGGTA ATAGGAAATG TTAGCGAAAG ACAAGGGTAT CCATTGTAGA 240

TTAACAAAAG GACGTTTCCA CAAGTGTGGG TTATTCTCAC TAAAGCAATA CGCAGAGACA 300

ACTTACGTAA AATTTTGAAC TGACTAGAAC GGAACTTCTA CTCAATTATT GATAAAAATT 360 

TTCAAAAAGA CTTGAATGTG CTGAGAATAC GAAGTTTATG GAAGGATTAT CAAAATATAA 420

ATGTGCATTC ATTTACAACC TTTATTGACA ATGATTCTCA ACTAATATAG TATATAATCA 480

AATCGTAATA GTTACGATTT GTTTTCTGCA ACTTTTTTGA AGTTTTAGTT GAGGTGAAAA 540 

CAATAAAAGC ATCTAAGTGA ATGTAGTTAA CGGACAACTG CATTCGCTTG TAGAGCCACA 600 

AGAAGCAACT TTAAATAAGG TTTACGGTTG CATTTTGATA CAACAACCGA TTACTAAGTC 660 

ATGCTTTCCA CTTTGCGGGT TAGCATGACT TACCTAATAG ATAGAGCTAT TAGGTTCAGC 720 

TTCTAAAAAA TTACAGTTTT AGAGGAATAC AGTTGcTTGc tTCGCAACAA CTGCATAAGA 780

GCCATGGTTT TCGCTTTTGC GAATTAGCAT GACTTACCTA CTAGATAGAG CTATTAGGTT 840

CATCTTCTAA AAAATTACAG GTTTAGAGGA ATACAGTTGT TTGcTTCGCA ACAACTGCAT 900 

AAGAGCCTCT AGTAATTAAA ATTACAGAGG CTCTAAAAAT ACATCTAAAG GAGTGTCGTA 960

TGAATCGGCA GGTTATAGAA TTTTCTAAGT ATAATCCTTC GGGGAATATG ACGATACTTG 1020 

TTCATTCAAA ACATGATGCT AGTGAATATG CATCTATCGC CAATCAGTTG ATGGCCGCAA 1080 

CACATGTATG CTGTGAACAG GTAGGCTTTA TAGrATCAAC ACAAAATGAT GATGGTAATG 1140

ATTTTCACTT AGTTATGAGC GGTAATGAAT TTTGCGGTAA TGCGACGATG TCATATATAC 1200 

ATCATTTGCA GGAAAGTCAT TTGCTTAAAG ACCAACAGTT TAAGGTGAAG GTGTCTGGCT 1260 

GTTCGGATTT AGTGCAATGC GCAATTCATG ATTGCCAATA CTATGAAGTT CAAATGCCAC 1320 

AAGCCCATCG TGTTGTGCCA ACAACAATTA ATATGGGTAA TCATTCATGG AAAGCAATAG 1380
TTCAACATTT GGTTGAAGCG TTTGTGCGTG AgcAACAATG GAGTCACAAA TATAAAACAG

TAGGTATGAT GCTTTTTGAT GAACAACGTC AATTTTTACA GCCATTAATC TATATACCAG

AAATTCAAAG TTTAATTTGG GAAAATAGCT GTGGTTCTGG TACAgcATCA ATTGGGGTTT

TTAATAATTA TCAACGTAAT GACGCATGCA AAGATTTTAC AGTACATCAG CCAGGGGGCA

GTATTTTAGT GACATCAAAG CGATGTCATC AATTGGGATA TCAAACTTCA ATTAAAGGAC

AGGTTACAAC TGTAGCTACA GGaAAAGCAT ATATAGAATA AGGAGCCTAC AATGAATAAC

TTTAATAATG AAATCAAATT GATATTACAA CAATATTTAG AAAAGTTTGA AGCGCATTAC

GAGCGTGTAT TACAAGACGA TCAATATATC GAAGCATTAG AAACATTGAT GGATGACTAT

AGTGAATTTA TTTTAAATCC TATTTATGAA CAACAATTTA ATGCTTGGCG TGACGTTGAA

GAAAAAGCAC AATTaATAAA ATCACTGCAA TATATTACAG CGCAGTGTGT TAAACAAGTG

GAAGTCATTA GAGCGAGACG TCTATTAGAC GGACAGGCGT CTACCACAGG TTACTTTGAC

AATATAGAAC ATTGTATTGA TGAAGAGTTT GGACAATGTA GTATAGCTAG CAATGACAAA

TTATTGTTAG TTGGTTCAGG TGCATATCCA ATGACGTTAA TTCAAGTAGC AAAAGAAACA

GGTGCTTCAG TTATCGGTAT TGATATTGAT CCACAAGCCG TTGACCTAGG GCGCAGAATC

GTTAACGTCT TAGCACCAAA TGAAGATATA ACAATTACGG ATCAAAAGGT ATCTGAACTT

AAAGATATCA AAGATGTGAC GCATATCATA TTCAGCTCGA CAATTCCTTT AAAGTACAGC

ATTTTAGAAG AATTATATGA TTTAACAAAT GAAAATGTCG TAGTTGCAAT GCGCTTTGGT

GATGGCATCA AAGCAATATT TAATTATCCG TCACAAGAAA CAGCGGAAGA TAAGTGGCAA

TGTGTGAATA AACATATGAG ACCACAGCAA ATTTTTGATA TAGCACTTTA TAAAAAAGCA

GCTATAAAGG TAGGTATTAC GGATGTCTAA ATTATTAATG ATAGGCACTG GTCCgGTCGC

AATGCAATTA GCGAATATTT GCTATTTAAA ATCAGATTAT GAGATTGATA TGGTTGGACG

TGCCTCAACA TCAGAAAAAT CAAAACGCTT ATATCAAGCG TATAAAAAAG AGAAACAATT

TGAAGTCAAA ATACAAAACG AGGCGCATCA ACATCTGGAA GGTAAGTTTG AAATTAATCG

TTTGTATAAA GATGTTAAAA ACGTTAAGGG TGAATACGAA ACGGTTGTCA TGGCATGCAC

AGCAGATGCT TATTATGACA CACTACAGCA ATTGTCGTTA GAAACTTTGC AAAGTGTCAA

ACATGTCATT TTAATATCAC CGACATTTGG TTCGCAAATG ATTGTCGAAC AATTTATGTC

TAAATTTAAT AAAGATATCG AAGTGATTTC ATTCTCAACT TATCTTGGCG ATACACGTAT

TGTTGATAAA GAAGCGCCTA ATCATGTGTT GACAACAGGT GTAAAAAAGA AATTGTACAT

GGGATCGACA CATTCAAACT CAACAATGTG TCAACGAATC TCTGCTTTAG CTGAGCAATT
TTATGTGCAC CCACCACTAT TTATGAATGA CTTTTCATTG AAAGCCATTT TCGAAGGAAC 3300

AGATGTACCG GTTTATGTGT ATAAGTTATT TCCTGAAGGA CCGATAACGA TGACACTAAT 3360

CCGTGAAATG CGTTTAATGT GGAAGGAAAT GATGGTTATT TTACAAGCAT TTAGAGTGCC 3420

GTCAGTCAAC CTGCTTCAAT TTATGGTGAA GGAAAATTAT CCAGTACGTC CTGAAACTTT 3480

GGATGAAGGT GATATTGAGC ATTTCGAAAT CTTGCCAGAT ATCTTACAAG AATATCTGCT 3540

TTATGTAAGA TATACCGCAA TCCTCATTGA TCCATTTTCA CAGCCAGACG AAAACGGACA 3600

TTACTTTGAT TTTTCAGCTG TACCATTTAA GCAAGTCTAT AAAAATGAAC AGGATGTTGT 3660

TCAAATTCCA AGAATGCCAA GTGAAGATTA TTACAGAACG GCGATGATTC AGCATATTGG 3720

GAAAATGCTA GGTATCAAAA CGCCAATGAT TGATCAGTTC CTAACTCGCT ATGAAGCAAG 3780

TTGCCAGGCG TACAAGGATA TGCATCAAGA TCAACACTTA TCTTCTCAAT TTAATACAAA 3840

TCTATTTGAA GGAGATAAAG CACTCGTCAC AAAATTTTTG GAAATCAATA GAACGCTTTC 3900

ATAATAAGGG TTTGAAGTTT TATAATAGAA AAAAATTATT GAATTATGTT TGACATTTAC 3960

ATAAAAATAA GCAAATAATT GAGAAAAATA ATCATTACGA TTTGATTAAG TAATGCAACT 4020

TATCAATTTA GAAAGAGGAA AAGCAAATGA GAAAACTAAC TAAAATGAGT GCAATGTTAC 4080

TTGCATCAGG GCTAATTTTA ACTGGTTGTG GCGGTAATAA AGGTTTAGAG GAGAAAAAAG 4140

AAAACAAGCA ATTAACGTAT ACGACGGTTA AAGATATCGG TGATATGAAT CCGCATGTTT 4200

ACGGTGGATC AATGTCTGCT GAAAGTATGA TATACGAGCC GCTTGTACGT AACACGAAAG 4260

ATGGTATTAA GCCTTTACTA GCTAAAAAGT GGGATGTGTC TGAAGATGGG AAGACATACA 4320

CGTTCCATTT GAGAGATGAC GTTAAATTCC ATGATGGTAC GCCATTTGca TGctGACGCA 4380

GTTAAGAAAA ATATTGACGC AgTTCAAGAA AACAAAAAAT TGCATTCTTG GTTAAAGATT 4440

tcgAcattaa TTGACAATGT TAAAGTTAAA GATAAGTACA CGGTTGAATT GAATTTGAAA 4500

GAAGCATATC AACCTGCATT GGCTGAATTA GCGATGCCTC GTCCATATGT ATTTGTGTCT 4560

CCAAAAGACT TTaAAAACGG TACAAcAAAA GATGGCGTTA AAAAGTTCGA TGGTACTGGT 4620

CCATTTAAAT TAGGTGAACA CAAAAAAGAT GAGTCTGCAG ACTTTAACAA AAATGATCAA 4680

TACTGGGGCG AAAAGTCTAA ACTTAACAAA GTACAAGCAA AAGTAATGCC TGCTGGTGAA 4740

ACAGCATTCC TATCAATGAA AAAAGGTGAA ACGAACTTTG CCTTCACAGA TGATAGAGGT 4800

ACAGATAGCT TAGACAAAGA CTCTTTAAAA CAATTGAAAG ATACAGGTGA CTATCAAGTT 4860

AAGCGTAGTC AACCTATGAA TACGAAAATG TTAGTTGTCA ATTCTGGTAA AAAAGATAAC 4920

GCTGTGAGTG ACAAAACAGT CAGACAAGCG ATTGGTCATA TGGTAAACAG AGATAAAATT 4980
ACAGACATTA ATTTCGATAT GCCAACACGT AAGTATGACC TTAAAAAAGC AGAATCATTA 5100

TTAGATGAAG CTGGTTGGAA GAAAGGTAAA GACAGCGATG TTCGTCAAAA AGATGGTAAA 5160 

AACCTTGAAA TGGCAATGTA CTATGACAAA GGTTCTTCAA GTCAAAAAGA ACAAGCAGAA 5220 

TACTTACAAG CAGAATTTAA GAAAATGGGT ATTAAGTTAA ACATCAATGG CGAAACATCA 5280 

GATAAAATTG CTGAACGTCG TACTTCTGGT GATTATGACT TAATGTTCAA CCAAACTTGG 5340 

GGATTATTGT ACGATCCACA AAGTACTATT GCAGCATTTA AAGAGAAAAA TGGTTATGAA 5400 

AGTGCAACAT CAGGCATTGA GAACAAAGAT AAAATATACA ACAGCATTGA TGACGCATTT 5460

AAAATCCAAA ACGGTAAAGA GCGTTCAGAC GCTTATAAAA ACATTTTGAA ACAAATTGAT 5520 

GATGAAGGTA TCTTTATCCC TATTTCACAC GGTAGTATGA CAGTTGTTGC ACCaAAAGAT 5580 

TTAGAAAAAG TATCATTCAC ACAATCACAG TATGAATTAC CATTCAATGA AATGCAGTAT 5640 

AAATAAAGGA GCAATTAGAT GTTCAAATTT ATCTTAAAAC GTATTGCGCT CATGTTTCCA 5700 

TTGATGATTG TAGTAAGTTT TATGACATTT CTATTGACGT ATATTACAAA TGAAAATCCA 5760 

GCTGTGACAA TTTTACATGC ACAAGGGACG CCAAATGTAA CACCAGAGTT GATTGCAGAA 5820

ACGAATGAGA AGTACGGTTT CAATGATCCA TTATTAATTC AATATAAAAA TTGGTTACTT 5880

GAAGCGATGC AATTTAATTT TGGTACAAGC TACATTACAG GTGACCCAGT TGCTGAACGT 5940 

ATTGGTCCAG CATTTATGAA TACATTGAAA TTAACAATAA TTTCAAGTGT TATGGTGATG 6000 

ATTACATCAA TTATTTTAGG TGTAGTTAGT GCATTAAAAA GAGGAAAGTT CACTGATCGT 6060

GCGATACGTT CAGTGGCTTT CTTTCTAACT GCATTACCAT CATATTGGAT AGCTTCAATA 6120

CTTATTATTT ACGTTTCAGT GAAGTTAAAC ATATTGCCGA CTTCTGGATT AACAGGTCCA 6180 

GAAAGTTACA TATTGCCAGT GATCGTTATT ACGATTGCCT ATGCTGGTAT TTACTTTAGA 6240

AATGTTAGAC GCTCGATGGT GGAACAATTA AATGAAGATT ATGTACTTTA TTTAAGAGCA 6300 

AGCGGTGTGA AATCTATCAC ATTAATGTTG CATGTGTTGC GTAATGCTTT ACAAGTTGCG 6360 

GTATCAATCT TTTGTATGTC TATACCAATG ATAATGGGTG GACTAGTTGT TATCGAGTAT 6420 

ATCTTTGCAT GGCCTGGACT AGGTCAATTA AGTTTAAAAG CAATACTTGA ACACGATTTT 6480

CCAGTCATTC AAGCATATGT ATTAATTGTA GCGGTATTAT TTATTGTATT TAATACATTA 6540

GCAGATATCA TTAATGCGCT ATTAAATCCA AGATTAAGGG aGGGCGCACG ATGATAATTT 6600

TAAAmCGATT ATTmCArGwT AAAGGTGCAG TAATTGCTTT AGGCATTATT GTATTATATG 6660

TCTTTTTAGG ATTAGCAGCA CCACTTGTGA CATTTTATGA TCCTAACCAT ATCGATACAG 6720

CAAACAAATT TGCTGGCATG AGTTTTCAAC ATCTACTAGG TACTGACCAT TTAGGTAGAG 6780 
TATTTGTTTC TGTACTTATT GGATCTATTT TAGGATTCTT ATCAGGATAT TTCCAAGGGT 6900

TTGTTGACGC CTTAATCATG CGTGCGTGTG ATGTTATGTT GGCATTCCCA AGTTATGTTG 6960 

TAACGTTAGC ATTAATTGCA TTGTTTGGAA TGGGTGCCGA AAATATTATC ATGGCATTTA 7020

TTTTGACGCG TTGGGCATGG TTCTGTCGTG TTATACGTAC AAGTGTTATG CAGTACACTG 7080 

CTTCTGACCA TGTAAGATTT GCTAAAACAA TCGGTATGAA TGATATGAAA ATTATTCACA 7140

AACATATTAT GCCATTAACA TTAGCAGATA TTGCTATCAT CTCTAGTAGC TCGATGTGTT 7200 

CAATGATCTT GCAAATATCT GGCTTTTCAT TTTTAGGATT AGGTGTCAAA GCGCCTACTG 7260 

CAGAGTGGGG CATGATGCTT AACGAaGCTA GAAAAGTGAT GTTTACACAT CCTGAAATGA 7320 

TGTTTGCGCC AGGTATTGCC ATAGTGATTA TAGTGATGGC ATTTAACTTC TTATCCGATG 7380 

CTTTACAAAT TGCTATTGAT CCCCGCATCT CTTCTAAAGA TAAACTTCGT TCTGTGAAAA 7440 

AAGGAGTGGT GCAATCATGA CATTGTTAAC AGTTAAACAT TTGACGATTA CAGATACCTG 7500 

GACAGATCAA CCACTCGTGA GTGATGTGAA TTTTACATTA ACTAAGGGTG AAaCTTTAGG 7560 

CGTTATTGGA GAAAGTGGTA GTGGTAAATC AATCACTTGT AAATCGATTA TTGGTTTGAA 7620 

TCCCGAACGA CTCGGGGTGA CAGGTGAAAT TATCTTTGAT GGTACAtCAA TGTTGTCATT 7680

ATCTGAATCG CAATTGAAAA AGTACCGTGG TAAAGACATT GCGATGGTCA TGCAACAAGG 7740 

TAGTCGTGCC TTTGACCCAT CAACTACTGT CGGTAAACAA ATGTTTGAGA CTATGAAAGT 7800

ACATACGTCA ATGTCTACAC AAGAAATTGA AAAGACATTG ATTGAATATA TGGATTATTT 7860

AAGTTTGAAA GATCCTAAAC GTATATTAAA ATCATACCCT TACATGTTAT CAGGAGGAAT 7920 

GTTACAGCGA TTGATGATTG CTTTAGCGTT AgcTTTgAAA CCAAAGTTAA TCATTGCTGA 7980

TGAGCCGACA ACGGCTTTAG ATACAATTAC ACAATATGAT GTACTGGAAG CATTTATAGA 8040

TATTMAAAA CACTTTGACT GTGCGATGAT TTTCATTTCA CATGATTTAA CGGTTATTAA 8100 

CAAGATTGCA GACCGTGTTG TTGTGATGAA AAATGGTCAG CTTATTGAAC AAGGGACACG 8160 

TGAATCAGTC TTGCATCATC CAGAACATGT TTATACGArt ATTkTATTAT CAACGAAGAA 8220 

GAAGATTAAT GATCATTTTA AACATGTGAT GAGGGGTGAT GTACATGATT AAAATTAAAG 8280 

ATGTTGAAAA GTCATATCAA AGCGCACATG TTTTTAAGCG TCGTCGAACA CCTATCGTGA 8340 

AAGGTGTGTC ATTTGAGTGT CCAATCGGTG CGACGATTGC GATTATCGGA GAAAGTGGTA 8400 

GCGGTAAATC GACGTTGAGT CktATGATAT TAGGTATTGA GAAACCGGAT AAAGGTTGTG 8460 

TAACCTTAAA TGATCAACCG ATGCATAAGA AGAAAGTGAG ACGTCATCAA ATTGGTGCTG 8520

TATTTCAAGA TTATACGTCA TCATTACATC CATTTCAGAC TGTTAGAGAA ATCTTATTTG 8580 
TGTTGGAAGA AGTCGGTCTA TCTAAGGCAT ACATGGATAA ATATCCTAAT ATGTTATCAG 8700

GTGGAGAGGC GCAACGTGTT GCGATTGCGC GTGCAATATG TATTAACCCT AAATATATTT 8760 

TGTTTGATGA AGCCATTAGT TCACTCGACA TGTCAATTCA AACACAAATA TTAGATTTAT 8820

TGATTCATTT ACGTGAAACG CGTCAGTTGA GTTATATTTT TATCACACAT GATATTCAAG 8880 

CTGCCACGTA TTTATGTGAT CAATTAATTA TTTTTAAAAA CGGAAAAATA GAAGAACAAA 8940 

TTCCGACAAG CGCATTGCAT AAAAGTGACA ATGCTTATAC AAGAGAATTA ATAGAAAAAC 9000 

AACTATCATT CTAAGGAGTG AGATAATGAA AGGTGCAATG GCTTGGCCCT TTTTGAGATT 9060 

ATATATATTA ACATTGATGT TCTTTAGTGC CAATGCAATC TTAAACGTGT TTATACCTTT 9120 

ACGAGGGCAT GATTTAGGCG CAACGAATAC GGTTATCGGT ATCGTTATGG GGGCATACAT 9180 

GTTAACAGCA ATGGTATTTC GACCATGGGC AGGACAAATT ATTGCTCGTG TCGGTCCCAT 9240 

TAAAGTATTA AGAATTATTT TGATTATCAA TGCCATAGCT TTAATTATTT ATGGTTTTAC 9300 

TGGCTTAGAA GGTTATTTCG TAGCACGTGT TATGCAAGGT GTGTGTACGG CATTCTTTTC 9360 

TATGTCTTTA CAGCTAGGTA TTATTGATGC ATTACCAGAG GAACATCGTT CTGAAGGTGT 9420

ATCATTGTAC TCGCTATTTT CAACGATTCC AAACTTAATC GGACCATTAG TTGCCGTAGG 9480 

TATTTGGAAT GCAAATAATA TTTCACTATT TGCAATTGTC ATTATCTTTA TCGCATTAAC 9540 

AACAACATTC TTTGsTATCG CGTGACCTTT GCTGAACAGG AACCCGATAC GTCAGATAAG 9600 

ATTGAAAAAA TGCCGTTTAA CGCTGTAACT GTTTTTGCGC AATTTTTCAA AAATAAAGAG 9660 

TTGTTGAACA GTGGTATTAT CATGATTGTT GCATCGATTG TATTTGGTGC AGTTAGTACA 9720 

TTTGTACCGT TATACACAGT GAGTTTAGGA TTCGCGAATG CGGGAATCTT TTTGACAATA 9780

CAGGCCATCG CAGTTGTTGC GGCAAGATTT TACTTAAGGA AATACATTCC GTCAGATGGT 9840

ATGTCGCATC CTAAATATAT GGTATCTGTA CTATCATTAT TAGTAATCGC GTCATTTGTA 9900 

GTGGCATTTG GTCCGCAAGT AGGTGCAATT ATTTTCTATG GTAGTGCGAT ATTAATAGGA 9960

ATGACGCAAG CAATGGTGTA CCCAACATTA ACATCATACT TAAGCTTCGT CTTACCAAAA 10020 

GTAGGTCGTA ATATGTTGTT AGGTTTATTT ATTGCCTGTG CAGACTTAGG TATATCGTTA 10080

GGTGGCGCA7 TGATGGGACC TATTTCCGAT TTAGTAGGAT TTAAATGGAT GTATCTAATT 10140

TGTGGTATGT TAGTCATTGT AATAATGATT ATGAGTTTCT TGAAAAAGCC AACACCACGT 10200 

CCAGCGAGTA GTCTTTAATG AAGTGAATTA AAGCATATTA AGTTAATGAA TATTTAAATT 10260 

TTAAAAGGTA TATTGaGCAT GGCGATTCAT GTGCTTCATG CTAGGACATG AAACATTCTA 10320

TATGGCTCGT TTTTAGAACG ACAtATATCT AAATAAAGCA CGCTTArAAG TGAGTTTTGA 10380 
TTGCTAAAAT CCGAAATGTT GAACCTTATA AAACAATCAA TTCACCTAAC CGTTACGAAT 12300

TTATTCATAA TGCTGAAGAT TTGATTCGTT TCGTCGATCA GTTGCAGCAA TTAGGTCAAA 12360

AACCAGTAGG ATTCAAAATT GTAGTAAGCA AAGTTTCAGA AATTGAAACA CTTGTACGTA 12420

CGATGGTGGA ACTAGATAAG TATCCAAGCT TTATTACGAT TGATGGTGGT GAAGGTGGTA 12480

CTGGTGCAAC ATTCCAAGAA TTACAAGATG GTGTTGGCTT ACCGCTATTT ACAGCTCTAC 12540

CTATTGTGTC TGGCATGTTA GAAAAATATG GTATTCGAGA TAAAGTGAAA TTGGCGGCAT 12600

CTGGTAAGTT AGTGACACCA GATAAAATTG CGATTGCACT AGGTTTAGGT GCAGATTTTG 12660

TAAATATCGC ACGTGGGATG ATGATTAGTG TCGGTTGTAT AATGAGTCAA CAATGTCACA 12720

TGAATACGTG TCCTGTAGGT GTTGCAACGA CAGATGCGAA GAAAGAAAAA GCATTGATTG 12780

TTGGAGAAAA GCAATATCGT GTCACAAACT ATGTAACAAG TTTGCATGAA GGCTTATTCA 12840

ATATTGCAGC AGCTGTTGGC GTATCCAGTC CTACAGAAAT TACTGCTGAT CATATTGTAT 12900

ATCGAAAAGT CGATGGTGAG TTACAAACGA TACATGATTA TAAATTAAAA CTCATTAGTT 12960

AACTTAATTA TTTCGGGAAA TTGAAAGCAG CGGATTTTAG CGTTACTGCA AATAATTTTA 13020

TATTAGTAGT GGATGCTGGT CACACAAGAA CTTCAAATAT TAAAGCCCTC AGAATATGAA 13080

TTAAGGTTTG TAACCTTAGT CTTATCTGAG GGCATTTTTA AGTTATAAAC TATTTGTCGT 13140

CCATTTTATC TTTTTCTTTT AAACCTCTGT GCTTTAATTG CTTTTCAAGT TTT7CAAAAC 13200

TAATATCTTT ATTTTCTTTA GTCGAAACAC CAAGACGTTT ATTTAATTTT TTCATGTCAA 13260

CTTCTGTGTA ATCTATGTCT AAGTGyTCAA TTGCTTTTTT ATCTTTATAG TCTACTTTGT 13320

ATTTTACGCC TTTAAGGTCT TTGAAAATAC TTTCAGATTT GGCGAATAAC TTTTTGGCTT 13380

CGTCTTTATC CATACCTAGA TCGTCATATT TAATTGTGTT GATTGTAGAC TGTTTTAAAA 13440

CTTTATCATC TTTATATGTG ATAGAAGTTA GTACATGTTT ACCACTAACA TCACCWTCAT 13500

ATGTTTTGGT TTGTTCTT7A CCACAAGCTG ATAATGCAAT GATACAAACT AATGCTACTA 13560

CAATTAATGA ACATAATTTT TTCAAAGTCA GTCGCCTTCT TTCGATATTT GTATTATAAA 13620

GAAATTATAA CATTTACTAA AAAATGATGT TATTCAAAAA TTTAAATTTT GTCATTTTTT 13680

TTGAAGATAT GAGTTTTTTT AAGCGGATTC CTCACAAAAT TTTAAAAATA TTTAAGCCTJc 13740

AAAATGATAA AGCGkTAGGG AACGTTTTTC TGAAAGTTAG TGATACAATA GTTTTAAGTT 13800

GAAATACAGG AGGATGAATA ACATGAATCA GTCAGTCAAA TTACTTAAAC ATTTAACAGA 13860

TGTAAACGGC ATTGCTGGTT ATGAAATGCA AGTTAAAGAA GCAATGCGTa ACTATATAGA 13920

GCCTGTCAGT GATCAAATTA TTGAAGATAA CTTGGGTGGC ATTTTTGGAA AGAAAAATGC 13980
AACAAAGATT GATAAACATG GTTTTATTTC ATTTACGCCA kTgGTGGATG GTGGAATCAA 14100

GTCATGCTAT CTCAAAAAGT AACGATTACA ACAGATTCGG GCAAAGAAAT TAGAGGTATC 14160 

ATCGGTTCTA AACCGCCACA TGTCTTAACG CCTGAAGAAC GTAAAAAGCC AATGGAAATC 14220

AAAAATATGT TTATAGATAT TGGTGTTAGT AGCAAGGAAG AAGCTGAAGA AGCTGGCGTT 14280

GAAGTAGGCA ATATGGTTAC GCCATATAGT GAATTTGAAG TGCTTGCAAA TGATAAATAT 14340

TTAACTGCGA ArCATTTGAT AATCGCTATG GCTGTGCATT AGCTATTGAG GTATTAAAAC 14400 

GTTTAAAAGA TGAAAATATT GGCATTAACT TATACAGTGG TGCCACAGTG CAAGAAGAAG 14460 

TTGGTTTGCG TGGTGCGAAA GTGGCAGCGA ATACGATTAA ACCAGACTTG GCGATAgcTG 14520 

TcGATGTAGG TATTGCTTAT GATACCCCAG GTATGTCAGG TCAAACGAGC GATAGTAAAC 14580 

TAGGCGGTGG TCCAGTTGTC ATTATGATGG ATGCTACAAG TATTGCTCAC CAAGGTTTGC 14640 

GAAAgcATaT TAAAGATGTA GCTAAGGAAC ATAACATCGA AGTACAATGG GATACGACAC 14700 

CAGGTGGAGG TACAGATGCG GGAAGTATTC ATGTCGCAAA TGAAGGTATT CCAACGATGA 14760 

CAATCGGTGT TACGCTGCGA TACATGCATT CTAATGTTTC AGTGCTCAAT GTAGATGATT 14820 

ATGAAAATTC TATCCGTCTT GTTACTGAAA TTGTCCGTTC ATTGAATGAT GAAAGTTATA 14880 

AAAATATCAT GTGGTAATCA AATCCATAAA TAATAAAGAA TCCTTTTAAT ATGGTAGGTT 14940

GTTAAACAAT TGTCTAATTT TAATTCTTAG TCATTAGACA GTATCCATGT TAATAGGATT 15000 

TTTTGTTTTT AATTTAAATG CTGAAAATCA ATTATGCCTA AATTTTGATA TTACAAGAAA 15060 

ATGATTTTTT CTTAAATGTA ATTGCACTAA AAACCAAAAA AACGGGAATA ATATACCTGA 15120 

TATATTACAT GAGGAGCGGT GCAAATGTTG TTAGAAATTA AAGATTTAGT GTATAAAGCG 15180 

AGCGATAGAA TCATACTAGA TCATATCAGT CTAAAAGTAG ATAAAGGCGA GAGTATTGCC 15240 

ATTATAGGTC CATCAGGTAG TGGTAAAAGT ACATTTCAAA AGCAAATATG TAATTTGTTT 15300 

AGTCCAACTA GTGGAGAACT TTATTTTAAA GGTAAACCCT ATAATGATTA TGACCCGGAA 15360 

GAATTGCGTC AACGAATCAG TTATTTGATG CAGCAAAGTG ACTTGTTTGG TGAAACGATT 15420 

GAAGATAACA TGATATTCCC ATCACTTGCA CGTAATGATA AATTTGATAG AAAACGTGCA 15480 

AAGCAATTAA TTAAAGATGT CGGTTTGGGA CATTATCAAT TAAGTTCGGA AGTGGAAAAT 15540 

ATGTCGGGTG GTGAGCGGCA AAGAATTGCT ATAGCGCGCC AACTGATGTA TACACCGGAT 15600 

ATTCTTTTAT TAGATGAATC GACCAGTGCA TTAGACGTTA ATAATAAAGA AAAGATAGAA 15660 

AATATCATTT TTAAATTAGC AGATCAAGGC GTGGCAATTA TGTGGATTAC CCACAGCGAT 15720

GACCAAAGTA TGCGACACTT TCAAAAGCGT ATAACAATTG TTGATGGTCA AATTTCTAAT 15780 
CATTCCGATT ATCATTTCAT ATAAAGAAGG TTTACATATT ATTAAAGATT TAATTGTTGC 15900

GACATTACGA GCAGTTGTGC AATTAATCAT TTTGGGATTT TTGCTGCATT ATATTTTTAA 15960 

AATAAACGAT AAATGGCTGC TTATTTTATG TGTATTGGTC ATTATTATTA ATGCATCATG 16020

GAATACAATT AGTCGAGCAT CACCAGTGAT GCA7CATGTG TTTTGGATAT CATTTCTAGC 16080

TATCTTCATT GGAACGGCAT TACCGCTTGC AGGTACTATT GCGACAGGGG CCATTCAATT 16140 

TACCGCAAAT GAAGTTATAC CTATCGGCGG CATGCTTGCA AATAATGGCT TGATTGCAAT 16200 

TAATTTAGCT TACCAGAATT TAGATCGTGC ATTCGTACAA GATGGTACTA ATATTGAATC 16260 

TAAATTATCA CTTGCAGCTA CACCTAAATT GGCTTCTAAA GGTGCAATAC GTGAAAGTAT 16320 

TCGTTTAGCT ATAGTGCCAA CTATTGATTC GGTTAAAACA TATGGGCTTG TGTCGATTCC 16380 

TGGTATGATG ACAGGCTTAA TTATTGGTGG CGTACCACCT TTACAAGCGA TTAAATTTCA 16440 

ATTGTTAGTC GTGTTTATTC ATACAACTGC GACCATTATG TCTGCTTTGA TTGCGACATA 16500 

TTTAAGCTAT GGTCAATTTT TCAATGCAAG ACATCAATTA GTAGCACGAA ATACTGATGT 16560 

TAAGAGTGAA TCATGATAGA TTTTACTGCA TCAGATTTAG GCATTAGTTT TAATTGGAAA 16620 

TGAAGTGACG CGCACATATA GTATCGCTAT TCATTAGCGC AGCGAAAATA TTCATAAAGG 16680 

CACGCATACT TTGTAGTCAG TTATCTGTTC TGACATATAA AGCGTGCGTG CTTTTTTGGA 16740 

GTTATTGTTG AAACTGAAGT AATTATACAT AATTATTAAA TGACATACTT GTGTTAATTT 16800 

TTCAAATACT GAAAAACAAT TTCaATAATT TTCCaATTAA GCACAGAAAA TTAAAGCAAA 16860 

ATATTATATA ATAGAACGGT TATATATaAA nATTr.gTgCA CACATTTTTT AATAAATCGT 16920

TATTCTAAGG GAAATGAATA TCGGAAATTT TGTTTGAAAG GAGTTTTAAA TTGTCAATCA 16980 

TGCGACTATT TACATTCATT TTAAGTATTT TTATCGTAGG AATGGTTGAA ATGATGGTTG 17040 

CAGGAATTAT GAACTTGATG AGTCAGGACT TACATGTATC AGAAGCTGTC GTTGGTCAAT 17100 

TAGTGACAAT GTACGCTTTA ACATTTGCGA TATGTGGACC TATTCTGGTT AAATTAACGA 17160

ACCGTTTTTC ATCAAGGCCT GTATTATTAT GGACATTACT TATATTTATC ATTGGTAATG 17220 

GCATTATTGC TGTAGCGCCA AATTTTTCaA TATTAGTAGT TGGTAGAATT ATCTCATCTG 17280

CAGCAGCAGC ACTAATTATC GTAAAAGTAT TAGCTATTAC AGCGATGTTA TCAGCACCTA 17340 

AAAATCGTGG TAAAATGATT GGACTTGTCT ATACAGGGTT TAGTGGTGCT AATGTTTTTG 17400 

GTGTACCAAT TGGAACGGTT ATCGGCGATT TAGTAGGTTG GCGCTATACA TTTCTATTCT 17460 

TAATTATTGT GAGTATTATT GTTGGCTTCT TGATGATGAT CTATTTACCG AAGGATCAGG 17520 

AAATACAACG AGGCCCTGTG AATCATGAGA CACCATCTCA TGAAAATCAT GTTACTTCGA 17580
CAAACTCAGT GACATTCGTC TTTATAAATC CACTTATTTT ATCTAATGGT CATGATATGT 17700

CATTCGTTTC ATTAGCACTT CTAGTAAATG GAATCGCTGG CGTTATTGGA ACATCATTAG 17760 

GTGGTATATT CTCCGATAAA ATTACAAGTA AGCGTTGGTT AATGATTTCT GTTTCTATTT 17820 

TTATCGTCAT GATGTTACTT ATGAATTTAA TCTTACCTGG TTCAGGTCTA TTGTTAGCAG 17880 

GACTATTTAT TTGGAATATC ATGCAATGGA GTACTAATCC AGCAGTGCAA AGCGGTGTGA 17940 

TTCAACATGT TGAAGGCGAC ACAAGCCAAG TAATGAGTTG GAACATGTCT AGTTTAAACG 18000 

CTGGTATTGG TGTTGGAGGC ATTATTGGAG GCTTGGTCAT GACACATGTT TCTGTTCAAG 18060 

CTATCACATA TACGAGTGCC ATCATTGGCG CATTAGGATT AATCGTTGTT TTCACATTGA 18120 

AAAATAATCA TTATGCTAAA ACATTTAAAT CATCATAATT CTCATATGAm AAGCACGCCT 18180 

GCTATCAAAT TCAGGTGTGC TTTTTTAGAT GCGATAACGT TATTGATATG TGCGATAATA 18240 

GCGACGTTCA TTATGATACA TCGGCCAAGG CATTTTACCG CTTTTAGCAA AATTAGCTAA 18300

ATCATTTTGC ATTTGTCGAC TTAAAAATTT AAGGTGaGCA GTTGTTGGaT ATgAT 18355
(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1192 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:
CGCAAAGAAG TACAAAAAAT GTTTTTACAA GAAGGTATTA AAACACCTCA ACCAATTATG 60 

ACTGCTTATA ATCATAGTGA AAACGgTGTT TAGTAGTTTA TAATACATGG AGGTCATATT 120 
TAATGGCGTC AAAATATGGA ATAAATGATA TAGTAGAAAT GAAAAAACAA CATGCGTGTG 180 
GAACAAACCG TTTTAAGATT ATTAGAATGG GTGCAGACAT AAGAATTAAA TGTGAAAATT 240 

GTCAAAGAAG TATTATGATT CCACGTCAAA CGTTTGATAA AAAACTTAAA AAAATCATCG 300 
AATCTCATGA TGATACACAA AGATAGGAGA ATGATTAATG GCTTTAACAG CAGGTATCGT 360 

TGGATTGCCA AACGTTGGTA AATCAACATT ATTTAATGCA ATAACAAAAG CAGGTGCTTT 420

AGCAGCGAAC TATCCATTCG CTACGATTGA TCCTAATGTA GGGATAGTAG AAGTGCCAGA 480

TGCTAGATTA CTTAAATTAG AAGAAATGGT TCAACCTAAA AAGACATTGC CGACTACATT 540

TGAATTTACA GATATCGCTG GTATTGTGAA AGGTGCTTCA AAGGGAGAAG GGTTAGGTAA 600

TAAATTCTTA TCACATATTA GAGAAGTAGA TGCGATTTGT CAGGTCGTTC GTGCATTTGA 660 
TAATATGGAA TTAGTACTAG CGGACTTAGA ATCTGTTGAG AAACGTTTGC CTAGAATTGA

AAAATTAGCA CGTCAAAAAG ATAAGACTGC TGAAATGGAA GTACGTATTT TAACAACTAT

TAAAGAAGCT TTAGAAAATG GTAAACCCGC TCGTAGTATT GACTTTAATG AAGAAGATCA

AAAATGGGTG AATCAAGCGC AATTACTGAC TTCTAAAAAA ATGCTTTATA TCGCTAATGT

TGGTGAAGAT GAAATTGGTG ATGATGATAA TGATAAAGTA AAAGCGATTC GTGAATATGC

AGCGCAAGAA GACTCTGAAG TGATTGTTAT TAGTGCAAAA ATTGAAGAAG AAATTGCTAC

ATTAGATGAT GAAGATAAAG AAATGTTCTT AGAAGaTTTA GGTATCGaAG AACCAGGATT

AGATCgrTTA ATTAGGAmCA ctTATGAATT ATTAGGnTTA TCCACCATAA TT

(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7494 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:

AATATAGCTG CAATAGCATC TCGTTTCATT TGTATAATCA ATTCCGGTTT AAATATCAGT 60 

GTGAACGTAA GCACGACACA GATTAAAAAT AACACTGCCG GAATGAGTCG TTTCAATCGT 120 

CGCTtCCAAA ACTCTAGCAA ATCGATTTTT TGCGTCCGAT AATACTCACT TATCAACAAA 180

CTTGTTATTA AATAACCTGA AATAACGAAG AATGTATCTA CTCCTAAAAA GCCCCCACTT 240

AACCATTGTG CATTCAAGTG ATAAATAATG ATTCCTATAA CTGCGAATGC CCTCAATCCA 300 

TCTAATCCAG GTAAGTATCG CGGGGAATAC ATTTTTTCTA AACGTTTAAA GTCTTTTGTA 360

TCCAfeTTAA TAAACGCCCC ATTTATTTTT CTCTATTTTG TAGTATATCA CAATATTTTT 420 

GAAAATAAAA TATTGCACTG aTTTTCATTA ATTGATTTAA CCCTTAATTA AGATAGTTTT 480

AAATTTTTTA TTAAGTAGAA AACAATTATT ACAGTTGATT TCATTACTGC AAACCACATA 540

TAAATTTGTC GATTTTACTA CATAACATAG ATTATCATAG ATTCTTGAAT TTTTAGCAAA 600 

ATAACTGTTA TTTTCATTAT ATTTTTACAA AAAAAGGTTC GTTTTATATT TTATGCATCT 660

TACTGTAACA GAATCATTAA GATATGCTAT TCGAATATAC TTTTTCAAAA TTTATATAAT 720 

GAATAAATTA ACATGTATTG AAAAAAAAGC GAAATGCAGC CTATCCTCTA ATGTAAACCA 780 

AACGATATAT CTCGTCAGAC TTTATATTTA AACGCTATGT GTCACTTTTA AAATGAATAT 840

TACTAAGATT GTCATATCAA TTATTATTGC ATCGAATTAA TCTTTTAAAT TTCTGTAATA 900 
ACGGAAG7CA TTATTAGAAT AAAAATACTG TGCACTAATA AATTTATCAA TTGTTCCTAA 1020

ATAAATACCA TCGATATTTT GTTCTTTACA TGTCATTATA ACTTTATCTA AAAGTTTTTT 1080

ACCTAT7TTT AAATTCCTAT AACCTTTATC AACAAACATT TTTTTAAGTG CAGACATATT 1140

ATTATCTAGT CTAATCAAAC CTATAGTACC AACAATATTT TGaTGATTGT TTATTGCAAG 1200

CCAAAATgCC CTCCATTATT CAAATAGTTA TGTTCGATGT TCTCCAAATC AGGTTGATCA 1260

TCTCTATCAA TTTTTATATa AATTCATTTT TTTGAATCGA TAAAATAAAC TCGATTAGCT 1320

CTTCCTTATA AGACCTATTA TATTCAATTA TGTTTATAGC CATTTTTATC TCCTTTTTCA 1380

TTTAATTTAA TTATAAAATG TGCGTTTAGT TTGTA7CTAG TGTACTCAGT ACAGCCTCAA 1440

ATGAAGTTTC ATTCCACTTG GCACTTAATA AAGACAAGTA TTTTAGCAGT AATACAATAA 1500

AGTCCAATAA ATTTCCCTAA CTTCAATATC CACTTTTTAA AAAATGTATT TTTAATTAAT 1560

AAAAAAACTC TCCCCAATTT CTATGGGAAG AGCTATATAT TTAATGTCTA AACATTACTT 1620

TTATTTATTA TGAAGGAATT AGAATCCCCA AGCACCTAAA CCTTGTGCTT TGTATGCTTT 1680

AACAGCTGCG TTGATTTGTT GGTCAACAGT GTTTGTTGGA CCCCAACCTG GCATAGTTTG 1740

GAATAAACCT GAAGCACCTG ATGGGTTGTA AGCATTTACT TGACCATTTG ATTCACGAGC 1800

GATGATTGCA GCCCATGTAG AAGCTGAAAC ACCAGTACGT TGAGCCATGA TTTGAGCTGC 1860

TGATGAACCA GTAGCACCTG CAGTATTACC ATTGCTTAAT CTCACTGAAC TTGAAGTAGT 1920

TGAAGTGCTG TAGTTATGGT AAGTTGGAGC TGAAACAGCT TCAACGTtTG AGTTACTTGA 1980

TTGTGCATTG TAGCTTACTG ATTGTACATT TGAACCTTGG TTGTATGAAG TAGTGTAGTC 2040

TGCACCTGCA ACGTTTGAGA AACCAGCAGT TTGACCATTA GCTGCTTCAT AGCTCCATGA 2100

CCATGTAGTA CCATTTGAAG TGAAGTTATA TTGGAAACCA TCTTTTACAA AGTGGATGTC 2160

ATATCCACCA TCTTTGATTG GAGCTGCATT TAATTGATCT TGGTGATTAT GCGCTAAGTC 2220

AACTAAGTGT GCTTGATCAA CGTTTACTTC AGCAGCGTGT GCTTGATGTC CTGTACCTGC 2280

TGCGTAACCT GTTACACCTA ATGCCACTGC TAATGATGAT GCCATAATTG TCTTTTTCAT 2340

AGTAAAAAAT CCTCCAGTAA TAATTGTnAG TTTATGTTTT TAGTAATTAT AtTTTGaATT 2400

TGAATGTCGT AGTgCAAGTT TAAATTGTCT TTTATTTCTT TCaACGGTAC TCACTATATC 2460

ACAaAAAACC AGCCAGTAAA TTACACTTTC TTTACAAAAC ATTACAATAT CAAGTGTTAT 2520

TTGtAATGTT GAAATATGGC TGTTTTATAC TGTAATGTGA AATATGTGCC CTTTAGAATC 2580

CAATCAACCC TTGAAATAGT CTTTAACACA TAAGATTTTT ACTATATTTA GCTCAACTAT 2640

TACAGCTTTC GTAATATTAC AGATTGTATT TTTGTTACAT AGCTGTAATA TATCTGACAT 2700
TACACATGTA TTGATTGCTA TTATTGTTGT ATATTCAAAG TTTTAAAACA CACATCTTTT 2820

GTGAATTGTC TTATCTTTTA TTAGCGCAAA TAAACTGCAG CTCAATTATA TTGTTCAACT 2880 

TCATTCTCGC AATTCACAAT AACATTAAAT AATTTTTGGT CTCATATTTT CAAAAAACAT 2940 

ACTGTTATTA TCCCATGAAT TTAAAAATAT CATTAGTATA TAAACGAAAC ACTTTACGAT 3000

AAATGATATC TGCAAGCCAA GCTGTTACAA ATGGTACAAC AAAGAACGCT ACTACAATTA 3060 

GTAAGACACT CAACCAAGCA GAATCAACCT CCATAAATTT AAATGCATTA ATCGGTCCTA 3120 

CCATTCCTAT AAAACCAAAT CCAGCTGACT CTTTCGTTCC ATGAATACCT ACTAATGCTG 3180 

ATACCAAACC TGATACAATG GCTGTCGTTA ATATTGGTAA CATAAGAATT GGATATTTCA 3240 

CCATATTAGG TATCATCATT TTAACGCCTC CAAAGAAGAC GGATAACGGC ACCCCTAAAC 3300 

GATTCACTTT ACTTGTACCA ATTATCAATA CTGCTTCAGT CGCGGAGATA CCAATTGACG 3360 

CTGATCCAGC TGCTAAACCT GTAATACCTA TCGCAAAGGC AATGGCCACA GTTGATAGTG 3420

GCGAAATAAT AATAAGACTA AATACCATTG AAATCAAAAT ACTCATGACA ATCGGTTGTA 3480

ATTCTGTAAA ACCATTAACC ATATTACCGA TGGCTGTTGT AATCATTTTC GTATACGGCA 3540

ATATTAAAAC ACCAATTGCA CCTGAAATAC CGCCAACAAC TGTTGGGAAT ACAATCAATG 3600

CCATACTACC TACGCGATGT TGAATAAGTA AAATGAATAA CACTGCAATC GCTGCTGTAA 3660 

TCATTGTATT AATTAAATCA CCAATACCCG TAATCATCCA AGCACCATTT TTAAACTGCG 3720 

CTGCACCGCT TCCTACATAT GCTGCACTTG CCACAACAGC AATTGCTAAT GGCGATAGGT 3780

CAAATTTCAT GGCAACCAAT GCACCAATCA AAGCAGGTAC TGTAAATTGA ATTGCAACGA 3840

CAACGCCTAA TAACGTTTTA AAAATCGGAT GATAATCCAT AAAGTATTTA AAAATTTCTC 3900

CAAGTATCGC ATTAGGAACT AAACCCGCAA CAATACCTAT GGCGACACCT GATAAAACTC 3960

TAAATATAAA ATCTTTGGGT GTAATTGTTT TAATTGATGT CATAATATCA TCCTTCCATT 4020

TATGTATATA CATCTGTATG CAAATAATAA AGAGCCTTAA GTTATAAGCT GCCACTAGCT 4080

TAAATTCTAA GATGTGCATG CCGATGTTGT TATATTTAGG CTAGCAGTAT CATCTATAAC 4140 

TCAAGACTAT GAAAAATAGT ATATCACAAA ATTCTGAATT TTTAGATAAA TAAATTGGCA 4200 

ATTTTTCAAA CATATTGTTA CAATACACTT TTATTTTATC TTCATTTTTA AAATGCATTA 4260

ATACAATAGA AGAAAGACAT TCAAATGCTT ACCAAAAAGG TACATTATTT GTTAGGAGCG 4320 

TATCAGCaCT TACATATCAT CAACACAATT GACAATATAA TAGAAGATAC TGATAATAAG 4380

TGTTAAAACA ACAGATGTTA GGTAGTGAAC AAATGATGGA AAGTAAATCC ATAGATCCAA 4440

GAATCGTTAG AACCAAACAA TTGCTTGTCG ATGCTTTTCT TAAAATTTCT AGAGAAAAGA 4500 
TTTACGCTCA TTTCGCTGAT AAAGAAGACC TCCTAGACTA CACATTATCT GTAACCATTT 4620

TAAAAGACTT GAATGATAAT TTGAGCATTT CTAATGTCAT TAATGAAAAG GTTCTGCGTA 4680

ATATTTTCAT TTCAATTGCG AGTTATATCA AAGATGCTGC AAAGTCTTGC GAATTAAATA 4740

GTGAAGCATT TTGCAACAAA GCACATCAAC GTATTAATAA TGAATTAGAA GATATTTTTG 4800

CGATTATGTT AGAAAACAGC TATCCGGAGC ATCAACGAGA TATCATTGTA AATAGTGCGA 4860

GTTTTTTAGC AGCTGGTATC TCAGGCTTAG CATTACATTG GTTTAACACG AGTCAAGAGA 4920

CAGCCGATGT GTTTATCGAT CGCAACCTTC CATTTTTAAT TCATCATATA GCACATTTTT 4980

AATAAAACTT GGTATTTAGT CATGCATCTT GAAATCACTA TGTGACTTAG GTTCATACTT 5040

GTACACACAA TAAAATTTAA CGTATTACGA TTGATTAGCC GTGTCTAGGA CATAAATCAA 5100 

CGTCCTATAC TCTACAATGT CATATTAGCA GTCGTTAACT GAATGAAAAT AAGCTTGTCA 5160 

TTAAAACATA TAGATTTTAG TGACAAGCAT TTTTGTTTTT GCGTACTTAA ACAACACTTC 5220

AGGCAATATG TTGTTTAGGC AACAAATGAT ATGTGCGTGT TTATTGGCAA ACGTACGACA 5280 

TAGTAGTATA GTATGTCTAA ACAACATATG TTGCATAGTT GATATGCGTT GTTTAAATAC 5340 

TAAGATAGGA GGGATTGACG TGAGCGAGAC AGATGAACCT CAGGGGTTTG AACGCACGCA 5400

TAATATATTA AATATTAATC AGAGTAGTCT GGGTGTAGTG ACATACATTA CAAATAAATT 5460 

AAAGTCGACG TTGAAGCAAC ACATAATAAT TGCTCGTGGT AAAAAGCGAA TCGACTATCG 5520

ACTGTCGTAT AACTTTTACA TACGTATTAT GATAATGTAG AAATCAAGAA AATCGACTGT 5580 

GAATATACCT ATGCTATGCC CATTGCAATT TTAATAAGAC ACACGATGTC ATTCGACAAT 5640

GCTCATTTCT TTGCTCAGTT ACGTCATCCT GTCTTATAAA ACAACATTGC AGACATGTAT 5700 

ATCAAACGAC ACTTCAATAA CATCACTTTG CCcATCGTAC TACTAGTAAA ATCGTGTCTC 5760 

AAATCCCTTA TTTTAATTCC AAAAAtCTGC TGGTCAAAAG ACCGAGAAAC TAAAAACATT 5820 

ACTTAATGTG TTGATAAATT ACCATATAAA AATAATCTCA AAATATATCA ACACTTGATT 5880

CTAAGGAGGA TATGACAATA TGAAAATTTT AGATAGAATT AATGAACTTG CAAATAAAGA 5940

AAAAGTACAA CCACTTACTG TAGCTGAAAA ACAAGAACAA CATGCATTGC GTCAAGAcTA 6000 

CTTAAGcATG ATCCGAGGAC AAGTATTAAC AACATTTTCC ACAATAAAAG TGGTTGATCC 6060

AATCGGTcAG GATGTCACAC CAGATAAAGT TTATGATCTT CGCCAACAAT ACGGTTATAT 6120 

TCaAAATTAA tATTTGCTCA CGAGGTATTG CACTTAAGGT GCCAACTGAC CTCATAAACA 6180 

AAGCCCATAC TGATTGAAGA CACTAATGTG tCsaCCATGG TGCACATTAC GCTTCATCTC 6240 

TGTATGGGCT TTTTATTTAT TCTTTTGAGA ATTTCATTTT AGCAGACCAA AAAATTAAAA 6300 
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TGAACGACTG TGCCACCCGC TTCTTTCACT TTATTCACCA ACTGGTCAAC TTCTTCATTT 6420 

GTGTTCACAC CTAGAGAAAT CATCACTTCA TTTGGTTCAG TATTAAGGCT TTGCTGACTT 6480 

ACATTTTGAA AATGCTTGTn TTCTATTAAA ATTACGGkTG tTTGACCTAT tTGAATGCCG 6540 

ACCATTTTAT CTAACATTTG TGGGTTTCTA TTTATTTTAA ATCCTAACGC TTTATAAAAC 6600 

TGTGCGCTCT TTTCTAAATC TTGCACATGC AAATTAAACC ACATTGATTG AATCATGATT 6660 

GCACCCCATT CATTACTTAT TATAGTTTTG GACTTTAAGC CAATCACTTA ATGATAATCT 6720

TGTTGGATTT ATTTCAGCCA TTAATTCAAA GTCTACTTCA TAACCTTTTT CTTCCAACCA 6780 

TTGCTTTTCT GCAACACCAC TAACAAATTC TCCTTCTATA ACAGTAGATT TACCTGTCAC 6840

TTCACTAAAA ATTGTTGCTG CTTCACTTAA TGTAACTTCA TCGGAACCAA TCTCTATTGA 6900

TTGATGCGTA AAGCTTTGTG GATGTGCAAA AATATACGAT GCAATTTTAG CTATATCAAT 6960

AGAAGAAATC ATTGTGAATT TTATATTCGG ATTAATAAAT TCTGGTAATG TAATACGTTC 7020

ATCTTCGACT TTAGCAATGC GTAAAAAATT ATCCATAAAG AATGATGGTT TGATAACTGT 7080 

TGCATTTATA TTAGATTCCA TTAATCTATT TTCTATTTTT GCTAGTACTT CAAAGTGTGG 7140

GCCAGTTCGA TTTCGATTAA CCCCTCCCGC AGTACTATAC ACAATATGTT GAATATTTTC 7200

TTGCTCAGCT ATTTCAATTA TCTTCATACC TTGTCTTAAT TCTTCGCTAA CATCATCTTT 7260 

AACGATTGGC TGAATACTGT ATAAGCCATA CTTACCTTTC ATCGCTGATT GCAAACTAAC 7320 

ATTATCACTC AGATCACCTT CArCGATTGA TAAATGCGGA TGTCCTATGT CTGAAAGTTT 7380 

ACGATTATnC TTATTTCTAG TTAATGCACT TACATACCAT CCATCCTCTA ACAACTGTTT 7440 

TACAACTGCA TTACCTTGCT TCCCTGTTGC GCCTATTACn AAAATATCTT TCAT 7494

(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11802 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:

AATTTATTTC GCCGTCCCAC CCCAACTTGC ATTGTCTGTA GAAATTGGGA ATCCAATTTC 60 

TCTTTGTTGG GGCCCcGCCC CAACTCGCAT TGCCTGTAGA ATTTCTTTTC GAAATTCTCT 120 

GTGTTGGGGC CCCTGACTAG AATTGAAAAA AGCTTATTAC AAGCGCATTT TCGTTCAGTC 180

AATTACTGCC AATATAACTT CGTAGATCAT AGAACATTGA TTTATTTCCC AGCCTATTCT 240 
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AGCAAAGGTA ATAATGATAT TAATAATGTA 
CAAAAAATAT AAATCAAATC GACATCCTTA 360

TAAAACATCA GAACCACTAA AAACAAAAAA GCACAAAATA AAATTAAATT TAAAATAAAC 420

GACCACTTTT CAAAAAAATC TCtTTTCaTa TTTCCACCCC TAATTTTAAT AAGCATTATT 480

TTATATTCTC TTTTAAGTTT ATTATTCAAA AGGAAAACAG AAATATCTTT CaATATTATT 540

ATAAACATTT CAACTACTTT TAAAAACCAA CAAAAAAATA CTTATTTTAA GTAGATGAGC 600

ATAAGTGAAC ATAGTTCTTT AGTTATAATA ATTAATTCAA CCAAAAGTCG ATTTGTTTTT 660

GCAATTGGTT TTCATTTCCT CTTAAAGATA TTTTCATTAA ATCTGTCAAA TCAATAGACG 720

CTATATTTTT CAACTTATCT CTATATTTAT TTTTAGTACG TCTTTCTAAA TTTCCCCATT 780

CCTCTTCTTC GTGAGTTAAT AAATGAAGCA TTGCTCGTTC TTGTATATTT TCAATCATTT 840

TTAAATTCGG TTTTAAAATA TGCAAATCAT CAAAACAATC TTTCCAACAA TCAACCATAT 900

CTCGTTTTAA TTCAATTTCC ACACGCCATA GAAATGTTGA ATCAATTTCA ACATCTGCAT 960

TATCTTTACG TTCTTGTTTT TATTATAAAT CCGAATAAAC CTATCACTAT TACGCACACC 1020

AAAATATTTT GTTTCTGGTT TTACATTACG TCCATAAAAT ATAGTTTTCT TTACCGACTT 1080

ATCTGACAAT GCATAATAGT CATTTAAATC AAATTCAAAA TCAAAAGCCA AATCTAATCT 1140

CGTAAAACTA ACATCGTCCA AATAACTGAT GATATTTTGT TTTAACCAAA GCACTTCATC 1200

ATGCGAAAGC TTATTAGGAT TAAATTCAAC GCGCATAtAC GTCTATTCCA AAGAGTTGCT 1260

TTTATTTTGT CATATTCAAT ATAAACTTTT TCTTTAAGAG CTTTAGCTTT AAAGTTTGTT 1320

TGTAAAATAT CCCAAAGCCG AATTTCAGGA TTAGTACTCA TAAAATGTGA AAGTCTCTCT 1380

GCGTTAGACA TGCTAAGATT CCCAACAATC GTTATAGCGT CAAAAGACAA TTTTGGAATA 1440

GCTAGTGACA TCCTATGTCG ATTTAACCGG CTATTACCGG ATATTAGAGT ATCCAGTTTT 1500

ACAAATGGAT GAAACGAAAT TCAAAACACT AAAAAATATG TTCCACTAAC AGCAAAAAAA 1560

TACCATTATG TTCCTACTAA AAAACyAAAA ATACTGGAGA ACAAATGTCA GGATATAACT 1620

TAGGATACTA TGTAATAAAA ATTTACAATA AAAAAACAGG AAAACAAATT TCAAGTAAAA 1680

GmATACCCAT ACAAAGAGGA TAAAATAAAA AACCTCGAAC TGaAATGATG ATCTTTTCAG 1740

CTCGAGGTTT AAATATTGGT GCCTTATTTA TATAGATTCG TTATATTATA TTCTCTATTT 1800

TCATTAACmT AATCCTTAAA GAGTTTTAAA TTAATACCTG CTAGATGATT CAAAAATGTT 1860

TCATCAACTT TTAAATAATT CAATAATTTT TGTGGTGTCA GTAAATnTCT ATCAAAATAC 1920

AACTTTAATA AACTATTCAT TTTGACAGGA CGTGACATTT CAATCACGTC GTCTAAAGAT 1980

AATACTTTCT CGCTTTAnAC AAAnACAAAA ACTTACCCGA TTAAAATCAA GTAAGTTTTA 2040
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TATTTGATAA AAAATCAATA AGTAATTGTG 
GCGCGTCGAT ATACATATCA TACTGACCAC 
TTGTATATGT CTGCTTTAAA TCAACTGCGT 
GTCCCTTTGG TCTTCCAACA TGAATGGTAT 
AGTGTTGTGG TTTGGGTTCA AGGAAGTCTG 
CAAAATATTC TGCTGATCGT TCAATGGCTT 
CTTTAAATGT ATTTGGAAAT GGGTAATTGT 
TGAAACCACT AGCAGAATCA AACAAAGCTG 
ATAAAGCGTA ATTCATAAAA TTTGTAAAAG 
GATTAATCGT CATATCATAT GGCAATGTAG 
GCTTTCGTAA ATGTTGGTCA TCTTCATCAA 
GTAATTCACA TGATTCAACG GATAGATTTT 
CTACAGTTGT ACCTCTCGTA CCAGGTTGAA 
TTTGTCGATG TTGGTGACCC GTAATAAAGA 
TGGCATATCC TTCATTTTCA CCCGTTAATA 
TTTCAAATCC ACCATGGTAA CAAACCACAA 
AGTATTGTTG AAGTATTTCA AAAGCACTAT 
GTTCCCAATG GGGAATAAAT TGTGTCGTTA 
CCTGAAAATA CTTCACACCG TTATCAGTCA 
ACAAAACTGG ATAATTGAGT CTGCGTAAAG
ATTCATGATT ACCAAGCGTA CCAAAGTCGA 
GCTGGCTACT GCCGCTATGC GCGATTAAGT 
CACCAT7ATC TATTTTAAAA CTTTGGTCAT 
TCGCTAGTAA CAATCCCATA GGTTGATATT 
TATAACCATG TACGTCACTC ACGACATAAA 
TCAATCACAA ACATCTTTCT TATTTCTATT 
GGTTTTGTCA CCGAGTTTTA AACGAATCTT 
ATTGACCTTA ATTGTGACAT TTCCGTTTTC 
ACCTGGTGGG TTATAATCGT TATCTTTACT 



CGCCTTCAAC TTGAATATCT TTTACAACTG 
CGCCTACTGC ACGATAATTA TTTACACAAA 
GACCTTGAAT CATCATATTG CTCACACGTT 
AACTTACGCC ACCATATATA TCATAATTAA 
CGCTCACACT AACTTCATCA TTTTTCACGT 
CTTTAAGTTT GGCACCACTT ACAGCTAAAA 
TAATAACATC TCGCATCGTC ACGACTTGCT 
TACAGGCAAC ATCTGCGTCA CTTTTTTCTA 
GATGCGGTGC CACACGTGCC TCAAATGCAT
TAATTTCGTA ATCTAACCAG TCCTCTAACT 
TAGTAAATGT GGAATCATCT ATAACAGGAA 
CATATTCATC AGTACTCAAG ACTACTCTGC
TCACAGCCGT TTGCTTAAAC CTTTCAGCAA 
TATCTATATC TTTAGAAAAC GCTTCTAACA 
CTTCGGTCGG CGTACCACTT TCTAAATCCT 
TGATATCTGC ATGTCGCTTC ATTTCAGGTA 
GAAACGTArT GnCnTGAATA TGCTCTGGTT 
AACCTATCAC ACCAACAGTT TGATCTCCAA 
ATGTACTATC ATTTTCATAT ATATTAGCGC 
TGTCTTTTAA GTATGGTAAT CCATAATTAA 
ATGCCATTCG ATTATAAAAA TCAACTAAAG 
AATTACAAAA TGGTGACCCT TGCAAAAAAT 
ACTGCCTTCT GTsTTGTTCT ATAACATGAT 
GATTTCTACT CGTAAAATCT GTTGGGAAAA 
ATGCTATGTT TGACATCCTC ACTCACTCCT 
ATATATTTAT TTGAAGTCTG TTGTAATCAA 
TGAACCTTCC ATACTTTCAA GTACTTTAGC 
ATCTGCTTTA ACTGTTGGCA AAGTACTGTA 
TGAAAATTGT CCGATTTGAC GTCCGCCTTC 
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TATTGTCATT TCAAATGGCT CATTTACAGA AACATTTTGC GGGATATCAA ATGTTACTTT 3960 

TTCGTTCTGA TTTGGTGGTG TATGATCATC TGGTGTGTTT GGCTGAGOAT CTGCGCCTTT 4020

TTCGCTGCCA TAACTACCTG CTTTAAATGT TGTTGGATCA TACCATTTAT AACCACTCGG 4080

CGGTTGTGAC CATGGCTCTT TTTCAGGCTC AGTTGAACGC TCTGGTCGTT CAAAATCAAG 4140

CAACTTAGTC TTTGTATCTA ATGTTAGGCT ACTCGCCTTA AGTGATTTCC CATCATTATC 4200 

TTTAGACATC CAAGCCGTTA TATTATTTAA TAGCTTACCG TTGTCTTGTT CTTTAAAACC 4260 

ATCATATGTT TTCTTCTTTT CTCCATTATC TTCTCTTACA TATTTGGGCG AACTATCTTC 4320 

CACAAGTGAT GAATCACCGA TAAATGCTGC TTTACCTTTT CCAACTTTAG AAATTGCTAC 4380

ATAGGGGCCT TCTGCTTTAC CGCCCCCATT ATAAATACCT TGATCTACAG CATGTGACCA 4440 

TTTACTTTTC GCTGGCAATT GTTCTGGTGT ATACACAATA CCTTTTGCTT TCTCTGGATT 4500 

AGTAATTGCT AATGTCGATC CGGCATGCAT AGAGACAGAT TTCACACCTT CAGTAATACC 4560 

GAAACTTTCT TTTGAAGAAA CAATATTGCT CGTATTTAAA TCACCTAGTG CATTATATCG 4620 

AAAACGTACG CCAAAGTTTG TAGATAACCA ATCTGAACTT TTCACACCTT GCATTGCAGT 4680 

AGAACTTTTT TCTTCTGCAT TCATACCTTT CGACATATCT TCATATGCTC CACGTCGATA 4740

ACCATTCATT GCCTCCGATG AATCAATACG ATTTAAATTT CGGTCAGCAT TGTAATGATC 4800 

TGAAATAAAG ACAACATTGC CACCTTGTTt CACATATTTA ACAATTGCTG CCTGTTCTGA 4860

TTCTTTGAAA GGAATGTTAG CCTCAGGAAT TACAAATATT TTGGAACTTT TCAAACTTGC 4920

TTCTGTTATG TTCGAATGAC CATCAATAGC TTTAACGTCA TAACCTTGTT TTTGTATTGA 4980 

ATCCGCATAA TCTGAAAATG CACCATCACT AACCCAATCT GCAGCACCAG CTGTTTGACC 5040

ATGAGAACGA TCGAATAATA CCGTTCGCTG TTGCTTTGTA GGTTGCGATT CATGCGTTAT 5100 

AGCTAAAGAT TGCGGTAAAG CACTTAATGA TACCGTTGCA ACAATTGCAG AGACAGTTAA 5160 

TGACTTATAT ATTTTTTTCA TTTTGTGAGG CTCCTTTTAA AATAAATTTG TTCTTGAATT 5220 

ATAGGATAAA AATTCGTTGC ATATGAGCAA TTTAACGAAA AATTTACAAA ATCTTATCAA 5280 

ACTCTTAAAG AAAGTTATTA AAATTCATTT TTATAAAATA CTTTTTAACA TTTAAATGTG 5340

GTACGCTATA AGTGTAATTT CATTGCATAC ATATTACACG ATTAAGAATG TGAAGGGGAC 5400

AGTTATCAAA TGAAAAATTT TAAGTGTTTA TTTGTATTAA TGTTAGCAGT CATTGTTTTT 5460

GCAGCAGCAT GTGGAAACTC AAGTTCTTTA GATAATCAAA AGAACGCTAG TAATGATTCG 5520

GATTCTAAAT CAGGAGGATA CAAACCTAAA GAATTAACCG TTCAATTTGT ACCTTCGCAA 5580

AATGCTGGAA CATTAGAAGC TAAAGCAAAA CCATTAGAAA AATTACTATC TAAAGAATTA 5640 
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TCTAAAAAAG TTGATGTTGG TTTCTTACCA CCAACGGCAT ACACATTAGC ACATGATCAA 5760 

AAAGCAGCTG ATTTATTATT ACAAGCACAA CGTTTCGGTG TAAAAGAAGA TGGTTCAGCA 5820 

AGTAAAGAAC TTGTAGATAG TTATAAATCA GAAATTCTTG TTAAAAAAGA CTCAAAAATT 5880

AAAAGCTTGA AAGATTTAAA AGGTAAGAAA ATTGCCTTAC AAGATGTAAC ATCAACTGCT 5940 

GGATATACAT TCCCACTTGC GATGTTAAAA AACGAAGCAG GTATTAATGC AACTAAAGAT 6000 

ATGAAAATTG TGAATGTTAA AGGTCATGAC CAAGCAGTTA TCTCATTATT AAATGGAGAt 6060 

GTAGATGCTG CGGCTGTATT TAACGATGCA CGTAATACTG TGAAAAAAGA CCAACCAAAT 6120 

GTATTTAAAG ACACACGAAT TTTAAAATTA ACACAAGCTA TTCCGAATGA CACAATTTCT 6180

GTAAGACCAG ATATGGATAA AGATTTTCAA GAAAAATTGA AAAAAGCTTT TATAGACATT 6240 

GCTAAATCAA AAGAAGGTCA CAAAATTATT AGCGAAGTTT ATTCACATGA AGGATACACA 6300 

GAAACGAAAG ATTCAAATTT CGACATTGTA AGAGAGTACG AAAAATTAGT TAAAGATATG 6360 

AAATAATCAT TATTTAACAA ATGAATCATT AGCGAATTTG GTATTAAAAG CTTTCGTTCA 6420 

ATAGATATAT TCTAGATTAA TATTGAAAAG CTAGGCGCTA AACTGAAACA GATATAGAAA 6480

GGTGTCGCTG TACATTTGAA ACCATTTGTA CACAGAAACC CAATGTCTAT GATATTTCAG 6540

TTTACCTTGG CTTTTCTTTA TTAAAGAAAG GTGTCAAACA TGAGTCAAAT CGAATTTAAA 6600 

AACGTCAGTA AAGTCTATCC TAACGGTCAT GTAGGCTTGA AAAATATTAA CTTAAATATT 6660 

GAAAAAGGTG AATTTGCAGT TATTGTCGGA CTATCTGGTG CTGGGAAATC CACGTTATTA 6720 

AGATCTGTAA ATCGTTTGCA TGATATCACG TCAGGTGAAA TTTTCATCCA AGGTAAATCA 6780 

ATCACTAAAG CCCATGGTAA AGCATTATTA GAAATGCGCC GAAATATAGG TATGATTTTC 6840

CAACATTTTA ATTTAGTTAA ACGGTCAAGT GTATTACGAA ATGTACTAAG TGGACGTGTA 6900 

GGTTATCACC CTACTTGGAA AATGGTATTA GGTTTATTCC CAAAAGAAGA CAAAATTAAG 6960 

GCAATGGATG CACTAGAACG CGTCAATATC TTAGATAAAT ATAATCAACG CTCTGATGAA 7020 

TTATCAGGTG GCCAACAACA ACGTATATCT ATTGCACGTG CGCTATGCCA AGAATCTGAA 7080 

ATTATTCTTG CAGATGAACC AGTTGCTTCA TTAGACCCAT TAACTACGAA ACAGGTTATG 7140 

GATGATTTAA GAAAAATCAA CCAAGAATTA GGCATCACAA TTTTAATTAA TTTACATTTT 7200 

GTTGACTTGG CAAAAGAATA TGGCACACGC ATCATTGGTT TACGTGATGG TGAAGTTGTC 7260 

TATGATGGTC CTGCATCTGA AGCAACAGAT GACGTATTTA GTGAAATATA TGGACGTACA 7320

ATTAAAGAAG ATGAAAAGCT AGGAGTGAAC TAACATGCCT TTAGAAATAC CTACAAAGTA 7380 

TGACTCCCTT TTAAAGAAAA AGGTTTCTTT AAAAACGAGT TTTACCTTCA TGTTAATCAT 7440 
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AATACCTCAA ATAGGTGATC TATTCAAACA AATGATTCCA CCTGATTTCG AGTATTTACA 
ACAAATTACA ACGCCAATGT TAGATACCAT TCGAATGGcT ATCGTAAGTA CAGTATTAGG 
TAGCATCGTT TCAATACCAA TTGCGTTATT ATGTGCTAGC AATATCGTTC ATCAAAAGTG 
GATTTCAATA CCCTCGCGCT TTATTTTAAA TATAGTTCGT ACTATTCCAG ATTTGTTATT 
AGCAGCAATC TTTGTGGCTG TATTTGGAAT CGGTCAAATT CCAGGGATAT TAGCACTGTT 
TATTTTAACT ATCTGTATTA TTGGAAAATT ATTATATGAA TCATTGGAAA CGATAGATCC 
AGGTCCAATG GAAGCAATGA CGGCTGTTGG CGCTAATAAA ATAAAATGGA TTGTTTTCGG 
TGTTGTACCA CAAGCCATAT CGTCATTTAT GTCATACGTA TTATATGCAT TTGAAGTAAA 
TATACGTGCT TCAGCTGTGC TTGGATTAGT CGGCGCTGGC GGTATTGGAT TGTTTTATGA 
TCAAACACTT GGTTTATTTC AATATCCAAA AACAGCAACG ATTATTTTAT TTACTTTAGT 
TATCGTCGTC GTCATTGATT ACATCAGTAC GAAAGTGAGG GCACATCTCG CATGACACAG 
GAAATAGCAA AATATAATGT TCACACAAAA GCACACAAAC GAAAATTGAT TAAAAGATGG 
CTTATTGCAA TTGTCGTCTT AGCTATTATC ATCTGGGCAT TTGCAGGTGT ACCAAGTTTA 
GAACTTAAAA GTAAATCATT AGAAATCTTA AAATCCATAT TCAGCGGATT ATTCCATCCT
GATATCAGCT ATATCTATAT ACCAGATGGC GAAGACTTAT TACGTGGTTT ACTTGAAACC 
TTTGCGATAG CCGTTGTAGG TACTTTCATC GCCGCAATTA TCTGTATTCC ATTAGCATTT 
CTAGGTGCAA ATAATATGGT AAAGCTACGC CCAGTTTCAG GTGTTAGCAA ATTTATTTTA 
AGTGTTATAC GTGTCTTCCC AGAAATTGTA ATGGCACTTA TATTTATCAA AGCTGTTGGC 
CCAGGTTCAT TTTCAGGTGT ATTAGCTTTA GGTATCCATT CCGTAGCATG CTTGGGAAAC 
TTTTAGCTGA AGATATTGAA GGTCTAGATT TCAGTGCTGT AGAATCATTA AAGGCCAGTG 
GTGCDAATAA GATTAAAACA CTCGTATTTG CAGTCATACC ACAAATTATG CCTGCCTTTC 
TATCACTCAT ACTTTATCGC TTTGAACTAA ACTTACGTTC AGCTTCTATA CTGGGGCTAA 
TTGGGGCTGG TGGTATCGGG ACACCACTCA TATTTGCCAT TCAAACACGT TCTTGGGACC 
GTGTAGGTAT TATATTAATC GGTTTAGTAC TAATGGTCGC AATTGTCGAT TTAATTTCCG 
GTTCAATCCG AAAACGTATT GTTTAACATT AAATCAGGAT ACTCCTAAAT AAGAAGTCCT 
ACCGTCTTAC GTTTCTCTAT TATAATAAAA ACAGCAGTGA AGAAAACTAT TGTTATAGTT 
AACTTCACTG CTGTTTTTAT AATATCTAAA TTTATTCTAT TTCAATTCCT TTAAATAACT 
TTTACCGAAC TCTGGTAATG TTACGTTGAA ATTATCTGCT ATAGTTGCAC CGATAGAACT 
GAATGTAGTA TCACTTTCTA GTGCATGACC ACCTTTAAAT TTCGGACTGT ACATAATTAC 
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TGTAATAATT ACTAAATCGT CTTCTTTTAA 
GTTGCTAAAC AGTTCTGGCA AGCGATCATC 9360

GAAATCTTTA ATTGCTTGTG CATAACCTGG TTTATCACGA CGATGACCGT ATAATGCATC 9420

AAAGTCTACT AAGTTTAAGA AGCTAATACC TGTGaAATCT TTCTTAACAA TTTTCATCAA 9480

TTGATCCATA CCGTCCATGT TACTCTTCGT ACGAACCGCT TCTGTTACAC CTTCACCATC 9540

ATAAATGTCA TTAATTTTAC CGATGGCAAT AACATCATAA CCACCGTCTT TCAAATGATC 9600

TAAGACAGTT TTACCAAAAG GTTTTAACGC ATAGTCATGT CGATTAGATG TACGTGTAAA 9660

GTTTCCTGGT TCACCAACAT ATGGACGTGC GATAATACGA CCAATTAAAT ATTTAGGGTC 9720

TTTTGTCAAC TCACGAACCT TTTCACAAAT ATCATATAAC TCTTCTAATG GGATAATGTC 9780

TTCATGTGCA GCAATTTGCA ATACTGGGTC TGCACTTGTA TAAACAATTA AGTCACCAGT 9840

TTTCATTTGG TGCTCGCCCC ACTCATCGAT AATTTGCGTA CCCGATGCCG GTTTGTTAGC 9900

AACAACTTTA CGACCTGTCA TTTCTTCAAT TTGTTGAATT AACTCTTCAG GGAATCCATT 9960

AGGGTATACT TTAAAAGGTT GCATAATATT TAATCCCATA ATTTCCCAGT GACCAGTCAT 10020

TGTATCTTTA CCAACTGAAG CTTCACTCAA TTTAGTATAG TATGCTTCTG GTTGTTCAAC 10080

TGCATTTACT ACTGGTAATT TATCGATGTT CCCTAGACCT AACTTTTCAA GGTTTGGTAA 10140

AGTTTGATCG AAACCTTCTA AGGTATGTCT TAAAGTATGT GAACCTTCAT CTTTAAAATC 10200

AGCTGCGTCT GGCGCTTCAC CAATACCTAC TGAATCCATT ACGATTAAAT GTACACGATT 10260

AAATGGTCTT GTCATAGCTA TCACTCCCAA AATTTATATA TATTAGTAAT CTGAATCTGC 10320

TTCTAAACCT TGCATAATTT GAACACCTGC GCTCGCACCA ATACGTGTCG CACCTGCTTC 10380

AACCATTTTA TTGAAATCTT CTAAATTACG TACGCCACCT GATGCTTTTA CTTCTACATC 10440

AGCACCTACT GTATCTTTCA TTAATTTAAC GTCTTCTGCA GTCGCACCGC CACCTGCAAA 10500

ACCTGTTGAA GTTTTAACGA AGTCCGCACC AGCCGCTTTT GTTAATTCAC TCGCTTTTAC 10560

AATTTCGTCA TGGTCCAACA ATACCGTCTC AATAATCACT TTTACTGTGT GACCTTTCGC 10620

AGCTTTAACC ACTGCTTCAA TGTCTTGTTG TACATCATCA AAACGTCCAT CTTTTAATGC 10680

GCCGATGTTG ATGACCATGT CAATTTCATC TGCACCATTT TGAATTGCAT CTTCTGTTTC 10740

AAATGCTTTC GTTGCAGTTG TCGACGCACC TAATGGGAAT CCTATTACCG TACAAACGAG 10800

CACCTCTGAA TCAGCTAGTC GCTCTGCTGC ATATTTAACA TGTGTTGGAT TCACACATAC 10860

AGATTTAAAA TTGTATGctT TCGCTTCATC GATGAT7TGA TCGATTTGCG TACGTGTTGA 10920

CTCAGGCTTC AATAAAGTGT GATCTATATA TTTCTCAAAT TTCATACTTA CTACTCCTCG 10980

TGTTATATAA TCTCTTTATT TAATTTTACT ATAAATACGA ATATATCTCG CGAATTTATA 11040
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ATACTCATTA AACCTAAAAT AATTAAAATA ATACCGAAAT GTGAACTTAA TGCATCATTG 11160 
CCTGGGAAAT TTAATGCTTT AAAATCGATT AGAGCCGCAG CAATCGCAAT ACCTACAGAT 11220 

ACCGCCACAT TAATAATTAA ATTATAAAAA CCAATAGCCA CACCTGTCAT ATTAAGATCT 11280
ATTGTTTTAA TGGCTTCGTT AAGTAAAGGT GCATACATTA AAGCAAAGCT ACCTGCAAAG 11340
AATATCATAG AAATGACGAA GATTGAAATG TGATTACCTA CTGCAAATGC AGGTAAAATC 11400

AAGCTCAGTG CTATTAAAAT AATTGCTGTG ATAATCGCTT GTTTTGAATT CAGATATTCG 11460
CCGATTTTAC CACTTAGTGC ACCAACAATG ACTGCTACTA TATAACCCGG TACTAATAAC 11520

AGTGATGTTG TGTCTAGTTG CAGATGATAA ATTTGCTCCA TTATGAATGG GAACGTAAAA 11580

ATATAACCCA ATTGGATAGC ATACATTACA AATACTATAA ATAAAAATGA AGCATAACGT 11640
TTATTTTGGA AAAATGATTT ATTTACTAAT GGACGTTGCG CATTTTTAAT ATATAGCGCA 11700

AAAACGATAA TCGCAATTAA GGCACCAATC ATATATAACC AATTAAAGTT CGTAATAAAC 11760

AGCATGACTG TTGTAGCAGG GGATCCTCTA GAGTCGAnCC TG 11802

(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1196 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71: 



CTAAAGAAGA TGCGAAACAA GATGTTGATA AACAAGTTCA AGCTTTAATT GACGAAATCG 60

ATCAAAATCC AAATCTAACA GATAAGGAAA AACAAGCACT TAAAGATCGT ATTAATCAAA 120

TAC^CAACA AGGTCATAAC GACATTAACA ATGCGATGAC AAAAGAAGCA ATTGAACAAG 180

CAAAAGAACG TTTAGCGCAA gCATTGCAAG ACATCAAAGA TTTAGTGAAA GCTAAAGAAG 240

ATGCGAAAAA TGATATTGAT AAACGTGTAC AAGLTITAAT TGACGAAATC GATCAAAATC 300

CAAATCTAAC AGATAAGGAA AAACAAGCAC TTAAAGATCG AATTAATCAA ATACTTCAAC 360

AAGGTCATAA CGACATTAAC AATGCGCTGA CTAAAGAAGA AATTGAGCAG GCAAAAGCAC 420

AACTTGCACA AGCATTGCAA GACATCAAAG ATTTAGTGAA AGCTAAAGAA GATGCGAAAA 480

ATGCAATAAA AGCCTTAGCT AATGCGAAgc GTGATCAAAT CAATTCAAAT CCAGATTTAA 540

CACCTGAGCA AAAAGCAAAA GCGCTCAAAG AAATTGACGA AGCTGAAAAA CGAGCACTAC 600

AAAACGTTGA GAATGCTCAA ACTATAGATC AATTAAATCG AGGATTAAAC TTAGGTTTAG 660
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TTGAAGCAAC ACCTGAGCAA ATCCTAGTTA ATGGTGAACT CATTGTACAT CGTGATGACA 780 

TCATTACAGA ACAAGATATT CTTGCACACA TAAACTTAAT TGATCAGCTT TCAGCAGAAG 840

TCATCGATAC ACCATCAACT GCAACGATTT CTGATAGCTT AACAGCAAAA GTTGAAGTTA 900 

CATTGCTTGA TGGATCAAAA GTGATTGTTA ATGTTCCTGT AAAAGTTGTA GAAAAAGAAT 960 

TGTCAGTAGT CAAACAACAG GCAATTGAaT CAATCGAAAA TGCGGCACAA CAAAAGATTA 1020 

ATGAAATCAA TAATAGTGTG ACATTAACAC TGGAACAAAA AGAAGCTGCA ATTGCGnAAG 1080 

TTAATAAGCT TAAACAACAA GCAATTGGAT CATGTTnAAC AATGGCACCT GGATGTTCCA 1140 

TTCAGTTGAA GGAAATTTCA ACAACAAGGA ACAAGCGCCn GATTGGAACA ATTTGA 1196 
(2) INFORMATION FOR SEQ ID NO: 72:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1519 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:



CAATCGTTTC AACGCTATTA TCTTTAGACA ACAATTGTAA GCGTGTATGT GCAGTTTCTA 60

AACAGTCTAT AATTCGAGTT CTTAATTCAG CTGGATCATC TTTAAAAATA AAATCCATCG 120

CTGCAACTTT GTAGACAAAT GTTAAATAGG TAAGTTCACT GTGACTCGTA ACGAAAATAA 180

TGTTACCAAC TGGGTCATGC TTACGAATTT CACTGCCTAA TTTGATACCA TTAATATCAG 240

TTGAAAGTTG AATATCTAAA AAGTAACAGC CTATGTCATT CATATTTTTA GCTTGCTCAA 300

GCACCTCATA AGGATTATCA GTTGCGAGGG CAATTTCCAT AGGCTTTTCT TCTATCATTA 360

TATAATTTTT AATAATGGTA ACCATGTTTT CTCTTTGTTT TGGATCGTCT TCGCAAATGA 420

AAATTTTCAT ACATTCACAT CCTTATGGCT AGTTGTTAAT AATTTCAACT TTTTGAATAA 480

AGAAACCATT TTCGATAATT GTATCTAATA AGACATTGTC TGCATTATCA GCAATTTCTT 540

TTAAAGTTGA TAGACCTAAA CCACGACCTT CACCTTTAGT AGAAAAACTT TCTTGGAACA 600

ATTCATGAAT GCGTGGTATA TCATCAGCGC ATTTATTCAT AACAATAAAC GTTACTGAAT 660

TTTCACTTTC AATAAATGCA ACGCGAATGA TAGGGTCATC AATTTCAGTT GATGCCTCAA 720

TTGCATTATC AAGAATAATA CCAATACTGC GACTTAAATC GATCATATTC AAGTTAATGC 780

TACTTACTTC ATCGGGTATT TCGATACTAA TCGGAATATT CATTTCTTGT GCACGTAAAA 840

TTTTCGCAGT AATTAAGCCT TTAATTTCAC GTACTTTAAG ATTCTCGATA CCATTTAATT 900
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GTAGCCCAGG CATGTCATCT TCTCGAATGT ATTCTGAAAG TGTCGTTAAG ATATTGACAT 
AATCATGACG GAACTTGCGC ATTTCGTTGT TGATAGCTTC AATCTTCAAT GTATATTCAT 
AATAGGTTTC AATTTCTTCT TGATTACGTT TATATTTCAT CTCTTTAAGG AGAAATTGAG 
AAATAACAAA TGTTAATATA CTTAAAAATA TAGTGATACC AATAAAAATA AAAGAATACT 
GCCTTATTAC TTTAGCTTCA TCCGAGTTTA TTTGTGAATA AAAGAAAAAT AATGAAAAAG 
TAAGCAGTAA GATAGTCGAA ATAACTATTA AAAATCCTTT GTTTAGTATT AGATATGGTG 
TGCTAATTTT TTTGAGAACT CTATTTATTA TATATGAGAA TAGTATACTA ATAGTCACAT 
AAACTACAAA AAAGCTAGGG AATATTACAA ATATACTATC AGAAATTTTG GTGGATATAT 
GCATATATAA CTATATACCT GTAGTTAGCA CnGTnATAGG AATAATCnGG CGAGGTCCAT 
AATCCACCAA AATAGAATA 
(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 5445 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73:



GTAGGAATCT CTTTGTCTTT TTGGGAGGAC ATTTAATATG AATGTATATT TAGCAGAATT

CCTAGGAACT GCAATCTTAA TCCTTTTTGG TGGTGGCGTT TGTGCCAATG TCAATTTAAA

GAGAAGTGCT GCGAATGGTG CTGATTGGAT TGTCATCACA GCTGGATGGG GATTAGCGGT

TACAATGGGT GTGTTTGCTG TCGGTCAATT CTCAGGTGCA CATTTAAACC CAGCGGTGTC

TTTAGCTCTT GCATTAGACG GAAGTTTTGA TTGGTCATTA GTTCCTGGTT ATATTGTTGC

TCAAATGTTA GGTGCAATTG TCGGAGCAAC AATTGTATGG TTAATGTACT TGCCACATTG

GAAAGCGACA GAAGAAGCTG GCGCGAAATT AGGTGTTTTC TCTACAGCAC CGGCTATTAA

GAATTAcrrr GCCAACTTTT TAAGTGAGAT TATCGGAACA ATGGCATTAA CTTTAGGTAT

TTTATTTATC GGTGTAAACA AAATTGCCGA TGGTTTAAAT CCTTTAATTG TCGGAGCATT

AATTGTTGCA ATCGGATTAA GTTTAGGCGG TGCTACTGGT TATGCAATCA ACCCAGCACG

TGATTTAGGT CCGAGAATTG CACATGCGAT TTTACCAATA GCTGGTAAAG GTGGTTCAAA

TTGGTCATAT GCAATCGTTC CTATCTTAGG ACCAATTGCC GGTGGTTTAT TAGGTGCAGT

GGTATACGCT GTATTTTATA AACATACATT TAATATTGGT TGTGCAATTG CrATTGTTGT
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CGAATCAATT TACTAAAATA AAAAGAAACG TAAATAGCAT AATTTAACAT GTTTGATTCA 900 

TGGATTATGC TATTTTTTCG CCAAAATTTA ACAGATTTTG TACAATGGGT TAGCGATTAT 960 

TTTTTAATAA AGGAGATACT ACTAATGGAA AAATATATTT TATCTATAGA CCAAGGAACA 1020 

ACAAGCTCAA GAGCGATTTT ATTCAATCAA AAAGGGGAAA TTGCAGGGGT AGCACAACGT 1080

GAGTTTAAGC AATATTTTCC ACAA7CAGGT TGGGTTGAAC ATGATGCAAA TGAAATTTGG 1140 

ACATCTGTGT TAGCTGTAAT GACGGAAGTA ATTAATGAAA ATGATGTTAG AGCTGATCAA 1200 

ATTGCAGGTA TCGGTATTAC AAACCAACGT GAAACAACGG TTGTTTGGGA CAAaCATACT 1260 

GGCCGCCCAA TTTATCACGC AATTGTTTGG CAATCACGTC AAACACAATC AATTTGTTCA 1320 

GAATTAAAAC AACAAGGATA TGAACAAACA TTTAGAGATA AGACAGGATT ACTTTTAGAT 1380 

CCGTATTTTG CAGGTACAAA AGTTAAATGG ATTCTAGACA ATGTTGAAGG TGCACGAGAA 1440 

AAAGCAGAAA ATGGCGATCT ATTATTTGGA ACGATTGATA CTTGGTTAGT ATGGAAATTA 1500

TCaGGaAAAg CtGCGCATAT TACTGATTAT TCaAATGCGA GTCGTACATT AATGTTTAAT 1560 

ATCCATGATT TAGAATGGGA CGATGAGTTA TTAGAACTAt TACAGTACCT AAAAATATGT 1620 

TGCCAGAAGT TAAAGCTTCG AGTGAAGTAT ATGGTAAGAC AATTGATTAC CACTTCTATG 1680

GTCAAGAAGT ACCAATCGCT GGAGTAGCTG GTGATCAACA AGCAGCATTA TTTGGACAAG 1740 

CTTGCTTCGA ACGTGGTGAC GTGAAAAACA CATATGGAAC TGGTGGCTTC ATGTTAATGA 1800 

ATACAGGTGA CAAAGCGGTT AAATCTGAAA GTGGTTTATT AACAACAATT GCTTATGGTA 1860 

TTGATGGAAA AGTAAATTAT GCGCTTGAAG GTTCCATCTT TGTTTCGGGT TCAGCAATCC 1920 

AATGGTTACG TGATGGATTA AGAATGATTA ATTCAGCACC ACAATCAGAA AGTTATGCGA 1980 

CACGAGTTGA CTCTACTGAG GGTGTTTATG TTGTTCCAGC TTTTGTAGGT TTAGGAACAC 2040 

CATATTGGGA TTCTGAAGCA CGTGGTGCGA TTTTCGGTTT AACACGTGGA ACTGAAAAAG 2100 

AGCACTTTAT CCGTGCAACT TTAGAATCAC TATGTTACCA AACTCGTGAC GTTATGGAAG 2160 

CAATGTCAAA AGACTCTGGT ATTGATGTCC AAAGTTTACG TGTCGATGGT GGTGCAGTTA 2220 

AAAATAACTT TATTATGCAG TTCCAAGCAG ACATTGTTAA TACTTCTGTT GAAAGACCTG 2280 

AAATTCAAGA AACTACAGCT TTAGGTGCTG CATTTTTGGC AGGTTTAGCA GTTGGATTCT 2340

GGGAGAGTAA AGATGATATC GCTAAAAACT GGAAATTAGA AGAAAAATTC GATCCGAAAA 2400 

TGGATGAAGG CGAAAGAGAA AAATTATATA GAGGTTGGAA AAAAGCTGTT GAAGCAACAC 2460 

AAGTTTTTAA AACAGAATAA ACTTGTAGAT TAGACTTTTG TATAAACATT GTGATACAAT 2520 

CAATTTAAGT TAATATTTGA ATCGAGAAGC GAGAGATTTG TTCGAACATG TACAATTGAA 2580 
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GTTATTAAAG GTGTGAGATG ATGACTGAAA AACAATTTAA ATTAACTGTA CAAGATAATA 4500 

CGAATATTGA AGTTAAAGTG AATTTTACAG ATGTAGATTC AAAAGGAATT ATTCATATAT 4560 

TTCATGGTAT GGCTGAACAT ATGGAACGTT ACGATAAATT AGCACATGCA CTTTCAAAGC 4620

ATGGCTTCGA TGTGATACGT CATAATCATC GAGGACATGG TATTAATATT GATGAATCAA 4680

CAAGAGGGCA TTACGATGAT ATGAAACGAG TTATCGGTGA TGCCTTTGAA GTAGCGCAAA 4740

CAGTGAGAGG CAATGTTGAT AAACCATACA TTATAATCGG ACATTCAATG GGATCCGTTA 4800

TAGCTAGATT GTTTGTAGAA ACATATCCGC AATATGTTGA TGGTCTAATT TTAAGTGGTA 4860

CTGGTATGTA TTCATTATGG AAAGGTTTAC CAACCGTTAA AGTGTTACAA CTGATTACAA 4920

AAATTTATGG TGCTGAGAAA CGAGTTGAAT GGGTTAACCA GTTAGTATCA AATAGTTTTA 4980

ATAAAAnnAT ACGTCCATTA CGTACACAAA GTGATTGGAT TTCTAGTAAT CCAATTGAAG 5040 

TAGATAaCTT TATTAAAGAT CCATATAGTG GaTTTAATGT GTCAAATCAA TTATTATATC 5100 

AAACAGCCTA TTATATGCTA CATACATCAC AATTAAAAAA TATGAAAATG TTAAaTCATG 5160 

CCATGCCTAT ATTATTAGTT TCAGGATATG ACGATCCTTT AGGTGATTAT GGTAAAGGGA 5220 

TTTTAAAATT GGCGAATATA TATAGAAACG CTGGCATnAA AAATGTTAAA GTGAATCTTT 5280

ATCATCATAA ACGTCATGAA GTGTTATTTG AAAAnGATCA TGACnAAATT TGGGAAGACT 5340

TGTTTAAATG GTTGAATCAA TTTTATAAAA AATAAAGAAA GTGGAATTAA ATATGAATAA 5400 

AAATAAGCCT TTTATTGTAG TAATTGTGGG GCCAACTGCT TGCAG 5445 

(2) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 2569 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74:

TGGCTTGAAC TACGCCAATA AGTCCCCCTA GTACAAGAAT GAATACCATG ATATCGACCG 60 

CTTCTATCGT ACCTTCAACC ATGCTACTTG TTATTTGTTC TGGTCCAGCT GGATGTTGCT 120

TTAATCTTTC ATAAGTATTC GGAATTGATA CCGGCTTATT AATTGCACCT GATTTAAATT 180

GTTCAATCTT AATTTTAACC CCCATTTTGT CTAGTTCCTG TTGCGTACCC GGAACCTTTT 240

TCACTTGGTT ATGAGGGTTA ACTATCTTTA GTTCTTGGGA TGAAGGTTCG TAAGAAAGTT 300

TAGAATATGC ACCAGCAGGA ATAACCCATG TTGCTATAAC TGCAACAACC GTTAAAATGA 360
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TTCTACTGTT CTTTGTGAAA AACCACGGTA TTCAATGCCA TCATACATTC CACCAAGCAC 2280 

ACGTGCAGTA TCTTTAGTTG TTTCTTTTTT ACCCATTTGT GATCCAGTTG GGCCTAAATA 2340 

AGTTACATTT GCACCTTGAT CATGCGCTGC AACTTCAAAT GCACATCGCG TTCTTGTAGA 2400 

ATCTTTTTCA AATAACAGTG CAATATTTTT ATTTTTTAAC ATAGGCTTTT CAGTGCCAAT 2460 

ATATTTAGCA CGTTTTAAAT CCTCGGAGAG TGTTAATAAG GTTCTACCTC TTGTCGTGAA 2520 

AAGTCTAATA AAGTTAAAAA ACTTCTGTTT CGTAnATTTT TCATTAAnA 2569 

(2) INFORMATION FOR SEQ ID NO: 75:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1273 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75:

CCTGGAACCA TCCaATCGtG CaAATCtTGa AAGaGAATAC GCAACAACAA TTAAATGTAT 60 

TGGAACACTA TATTCCAAAT GACCATCCAG CACTCGTTGA ATTAAAAATA TGGGAACGTT 120

GGTTACATAA ACAAGGTTAC AAAGACATCC ATTTAGATAT TACTGCGCAC CACCTAGATC 180

CTATTACACA GGTTTATTTA TTCAATGTCA TTTTGCTGAA AATGAATCTC GAGTTTTAAC 240

AGGTGGTTAT TACAAAGGAA GCATCGAAGG GTTTGGATTA GGATTAACAC TTTAAGTAAG 300 

GGAGTATGCA CAATGTTAAG AATCGCCATA GCCAAAGOAC GTCTAATGGA TAGTTTAATT 360 

AACTATTTAG ATGTAATTGA ATATACGACA TTATCAGAAA CATTAAAAAA TAGAGAACGC 420 

CAATTATTAT TAAGTGTAGA TAATATTGAA TGCATTTTAG TAAAAGGAAG TGACGTGCCA 480 

ATCTATGTGG AACAAGGAAT GGCAGACATA GGCATTGTTG GTAGCGACAT ATTAGATGAG 540

CGCCAATATA ATGTTAATAA TTTGTTGAAT ATGCCTTTTG GAGCATGTCA TTTTGCGGTT 600 

GCAGCGAAAC CTGAAACGAC CAATTATCGT AAAATCGCAA CGAGTTATGT TCATACTGCT 660 

GAAACATATT TTAAATCAAA AGGTATTGAT GTCGAATTGA TTAAATTGAA TGGCTCTGTT 720 

GAATTGGCCT GTGTTGTAGA TATGGTAGAC GGAATTGTCG ACATCGTTCA AACAGGTACT 780

ACGCTAAAAG CGAACGGACT GGTTGAAAAG CAACATATTA GTGATATCAA TGCAAGATTA 840 

ATAACTAATA AAGCAGCTTA TTTTAAAAAA TCACAATTAA TAGAGCAATT TATTCGCTCT 900 

TTGGAGGTGT CTATTGCCAA TGCTTAATGC ACAACAATTT TTAAATCAAT TTTCATTAGA 960 

AGCACCATTA GATGAGTCAT TGTATCCaAT TATTCGCGAT ATTTGTCAGG AAGTTAAAGT 1020 
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TTTAGaAATT AGTCATGAmC AAATTAAAGC AGCATTTGAC ACATTAGATG AAAAAACAAA 1140 

ACAAGCATTA CAACAAAGTT ATGAAAGAAT TAnAGCATAT CAaGAAaGTA TtaAACAGaC 1200 

GaATCAACAG TTAGAAGaAT CAGTGGaGTG tTrTGaAATA TACCATCCmC taGaAAGTGT 1260

CGGTATTTAT GTG 1273 



(2) INFORMATION FOR SEQ ID NO: 76: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1308 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76: 



GTTGATAAAT TAAAAATGTT TTTATCAGAT ATTCAAAGTT ACCAACAATA TAGTAAAGAT 60

CATCCGGTGT ATCAGTTAAT TGATAAATTT TATAATGATC ATTATGTTAT TCAATACTTT 120

AGTGGACTTA TTGGTGGACG TGGACGACGT GCAAATCTTT ATGGTTTATT TAATAAAGCT 180

ATCGAGTTTG AGAATTCAAG TTTTAGAGGT TTATATCAAT TTATTCGTTT TATCGATGAA 240

TTGATTGAAA GAGGCAAAGA TTTTGGTGAG GAAAATGTAG TTGGTCCAAA CGATAATGTC 300

GTTAGAATGA TGACAATTCA TAGTAGTAAA GGTCTAGAGT TTCCATTTGT CATTTATTCT 360

GGATTGTCAA AAGATTTTAA TAAACGTGAT TTGAAACAAC CAGTTATTTT AAATCAGCAA 420

TTTGGTCTCG GAATGGATTA TTTTGATGTG GATAAAGAAA TGGCATTTCC ATCTTTAGCT 480

TCGGTTGCAT ATAGAGCTGT TGCCGArAAA GAACTTGTGT CAGAAGAAAT GCGATTAGTC 540

TATGTAGCAT TAACAAGAGC GAAAGAACAA CnTATTTAA TTGGTAGAGT GAAAAATGAT 600

AAATCATTAC TAGAACTAGA GCAATTGTCT ATTTCTGGTG AGCACATTGC TGTCAATGAA 660

CGATTAACTT CACCAAATCC GTTCCATCTT ATTTATAGTA TTTTATCTAA ACATCAATCT 720

GCGTCAATTC CAGATGATTT AAAATTTGAA AAAGATATAG CACAAATTGA AGATAGTAGT 780

CGTCCGAATG TAAATATTTC AATTGTGTAC TTTGAAGATG TGTCTACAGA AACCATITTA 840

GATAATGATG AATATCGTTC GGTTAATCAA TTAGAAACTA TGCAAAATGG TAATGAAGAT 900

GTTAAAGCAC AAATTAAACA CCAACTTGAT TATCGATATC CATATGTAAA TGATACTAAA 960

AAGCCCTCAA AACAATCTGT TTCTGAATTG AAAAGACAAT ATGAAACAGA AGAAAGTGGC 1020

ACAAGTTACG AACGAGTAAG GCAATATCGT ATCGGTTTTT CAACGTATGA ACGACCTAAA 1080

TTTCTAAGTG AACAAGGTAA ACGAAAAGCG AATGAAATTG GTACGTTAAT GCATACAGTG 1140
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GATGGATTAA TCGATAAACA TATTATCGAA GCAGATGCGA AAAAAGATAT CCGTATGGAT 
GAAATAATGA CATTTATCAA TAGTGATTAT ATTCGATATT GCTGAAGC 
(2) INFORMATION FOR SEQ ID NO: 77: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1431 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77: 
GATGCCATTn ATnnGTATGC AAGAAGTTGT TCCGGGTTCA GGTGGATTaC CAGTTGGTAC 
TGGTGGTAAG ACGTTACTAA TGCTTTCAGG CGGTATAGAC TCACCAGTTG CTGGGATGGA 
AGTGATGAGA CGTGGCGTAA CAATTGAAGC GATTCATTTC CATAGTCCAC CATTTACAAG 
TGATCAAGCA AAAGAAAAAG TTATTGAATT GACACGTATT TTAGCTGAAC GTGTTGGACC 
AATTAAATTG CATATTGTAC CATTTACAGA ATTGCAAAAA CAGGTAAATA AAGTTGTACA 
TCCAAGATAT ACAATGACTT CAACGAGACG TATGATGATG CGTGTTGCTG ATAAATTAGT 
ACATCAAATA GGGGCTTTAG CTATTGTAAA TGGTGAAAAC CTAGGGCAGG TAGCCAGTCA 
AACACTTCAT AGCATGTATG CAATTAATAA TGTAACTTCT ACTCCTGTAT TACGTCCTTT 
ATTAACTTAC GATAAAGAAG AAATTATTAT TAAATCGAAA GAAATTGGTA CATTTGAAAC 
ATCTATTCAA CCATTTGAAG ATTGTTGTAC AATTTTCACC CCTAAAAATC CAGTAACCGA 
ACCAAACTTT GATAAGGTAG TCCAATATGA AAGTGTCTTT GATTTTGAAG AGATGATTAA 
TCGTGCTGTT GAAAATATTG AAACACTTGA AATAACTAGT GATTATAAAA CTATTAAAGA 
ACAGCAAACA AACCAATTAA TAAACGACTT TTTATAAATA AAATCCTAGA GTAAATTTAA 
ACATAAGGGG ATGTTAAACT ATGGATTTGA ACTTAACGAT GATTATAATC ATAATTTTAT 
TTGGTTTTAT CGCGGCGTTT ATAGATTCGG TTGTAGGGGG TGGCGGTTTA ATTTCTACGC 
CAGCATTATT AGCAATCGGT CTACCACCAT CTGTGGCTTT AGGTACAAAT AAATTGGCAA 
GTTCGTTTGG TTCTTTAACT AGTACGATAA AGTTTATAAG GTCCGGTAAA GTGGACTTAT 
ATGTTGTTGC CAAATTATTT GGTTTTGTAT TTTTGGCATC TGCATGTGGC GCATATATTG 
CAACGATGGT TCCGTCACAA ATATTGAAAC CTTTAATCAT CATTGCACTT TCGTCGGTGT 
TTATATTCAC ATTACTTAAA AAAGATTGGG GCAATACACG CACGTTTACT CAATTTACAT 
TTAAGAAAGC CATAATATTT GCAGCACTTT TTATATTAAT CGGCTTTTAT GATGGATTTG
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TAAGTGCAGC AGGAAATGCT AAAGTTTTGA ACTTTGCTTC TAATATAGGT GCGCTTGTAT 1380 
TATTTATGGT ATTAGGACAA GTAGATTATG TAATAGGTTT AATTATGGCT A 1431 
(2) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4403 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78:

AATATTATTT TAAATTCAAT ATTTATTGGT GCATTTATTT TAAACTTATT ATTCGCCTTT 60 

ACCATTATTT TCATGGAAAG ACGTTCTGCC AATTCTATCT GGGCTTGGTT ACTAGTCTTA 120

GTTTTCTTGC CTTTATTCGG CTTCATTTTA TACTTACTAT TAGGACGACA AATTCAACGT 180

GACCAAATTT TCAAAATTGA TAAGGAAGAT AAAAAAGGAT TAGAGTTAAT CGTTGATGAG 240

CAATTAGCTG CTTTAAAAAA TGAAAACTTT TCAAATTCCA ATTATCAAAT TGTAAAATTT 300

AAAGAAATGA TTCAAATGTT GTTATATAAT AACGCAGCAT TTTTAACAAC AGACAACGAT 360

TTArrrrtAT ACACAGACGG CCAAGAAAAA TTTGATGACC TAATACAAGA CATCCGTAAT 420

GCTACTGATT ATATTCATTT TCAGTACTAT ATTATTCAAA ATGATGAATT AGGTCGTACC 480

ATTTTAAATG AACTTGGTAA AAAAGCGGAA CAAGGTGTAG AAGTTAAAAT TCTTTATGAT 540

GACATGGGTT CTCGTGGACT GCGTAAAAAA GGCTTACGCC CGTTTCGCAA TAAAGGTGGA 600 

CATGCTGAAG CATTTTTCCC ATCAAAATTA CCTTTAATTA ACTTGCGTAT GAACAATCGA 660 

AACCATCGAA AAATTGTTGT AATAGATGGG CAAATTGGAT ATGTTGGTGG TTTTAATGTT 720 

GGTGATGAGT ACTTAGGTAA ATCAAAAAAA TTCGGCTATT GGCGAGATAC GCATTTACGA 780

ATTGTCGGGG ATGCAGTGAA TGCATTGCAA TTACGATTTA TTCTAGATTG GAATTCACAA 840

GCCACACGTG ACCACATCTC CTATGATGAT CGTTATTTCC CAGATGTAAA TTCTGGTGGA 900 

ACAATTGGCG TTCAAATAGC TTCTAGTGGT CCTGACGAAG AATGGGAACA GATTAAATAC 960 

GGCTATTTGA AAATGATTTC ATCTGCTAAA AAATCGATTT ATATTCAATC TCCCTATTTC 1020 

ATACCTGATC AAGCCTTTTT AGATTCTATT AAAATTGCGG CATTAGGTGG TGTTGATGTC 1080 

AATATCATGA TTCCTAATAA ACCTGACCAT CCGTTTGTTT TTTGGGCTAC TTTAAAAAAT 1140 

GCAGCATCCT TATTAGATGC CGGTGTTAAA GTATTTCACT ACGACAATGG CTTTTTACAC 1200 

TCAAAAACAC TTGTTATAGA TGATGAAATT GCAAGTGTGG GAACAGCTAA TATGGACCAT 1260 
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TAACAAATAA AGGTGCGTTA TTAATAACAG TGCCAGGCAA AAATGATGAA GTACAACGCT 180 

GTATTACTGC TCATGTTGAT ACTTTAGGTG CaATGGTTAA AGAAATTAAA GAAGATGGTC 240 

GCTTaGCAAT AGAATTAATT GGAGGATTCA CGTATAACGC GATTGAGGGT GAATATTGCC 300 

AAATTAAAAC TGATGCTGGT CAAATATATA CAGGAACAAT TTGTCTGCAT GAAACAAGTG 360 

TTCATGTATA TAGAAATAAT CATGAAATAC CTAGAGATCA AAAGCATATG GAAATAAGAA 420 

TTGATGAAGT AACTACATCA GAAGAAGATA CAAAGAGTTT AGGTATTTCA GTAGGTGATT 480 

TTGTTAGCTT TGATCCACGT ACAGTTATCA CGTCATCAGG TTTTATTAAA TCTCGTCATT 540 

TAGATGATAA AGCTAGCGTA CGgTtGATAC TACAATTACT AAAGAAATTA AAAGAAGAGC 600

AAATAATATT ACCACATACA ACGCAATTTT ATATTTCTAA TAACGAAGAA ATAGGTTACG 660 

GTGCAAATGC ATCAATTGAT TCGAAAATCA AAGAATATAT TGCATTAGAT ATGGGCGCGT 720 

TGGGAGACGG TCAAGCATCG GATGAATATA CAGTTTCTAT TTGTGCCAAA GATGCTTCAG 780

GTCCATATCA TAAGCAATTG AAATCGCACC TAGTTAATCT TTGCAAAATA AATAACATTC 840 

CATATAAAGT AGACATATAT CCATATTATG GTTCAGATGC TTCAGCAGCT TTACATGCTG 900 

GTGCGGATAT CAGACATGGT TTATTTGGCG CTGGCATTGA ATCATCTCAT GCAATGGAAC 960 

GAACACATAT TGATTCTATT AAAGCGACAG AGAAATTACT ATATGCATAT TGCTTATCAC 1020 

CAATTGAGTA AACAATTAGT GTTGACAAAT GTGaACGACC TATGTAATAT AATGAACTAT 1080 

AAAAATAATT AGAATTTTCT AAAGAAATAG TAGCAGATAT GAAACGTAGC AAATAGAAAG 1140 

CTAATGGGTG ATGGGAATTA GCACGCCATA TCTTGTGAAT TGGACTTTGG AAAACAATTG 1200 

AATGAGTTTT GAAAGTGAAC ATGAATTATG TTAACTAAGG TGGCACCACG GTAACGCGTC 1260

CTTACAGGTA TATGCGTTAT GTGGTGTCTT TTTATTTAGA CAAAATGTAG TAGTTAATTA 1320

AAGGTAGCAA CAGAAAGTTA GTGGATGATG TGAACTAACA CCGAGATTAA TGAAATTGGG 1380

TTTTGTCTGC AACAGAAAAA TTATATATAG TAAAGAGTGA ACTATGAATA TTTCGAATAT 1440

TCGGTTAATT TAGGTGGTAC CACGCGTCAc nTCCTTTATA TTGATAAGGA TGCTGGCGCT 1500 

TTTTTGAAAG GAGCGTATAG AATGGATATA TTTTATAAAA AAATAAAAGC AAATGTAACG 1560 

CCCGAAGTTT TAGCACAACT TCATTCCAAG AAGaTCATTT TGGAAAGTAC AAATCAACAA 1620 

CAAACTAAAG GTCGCTATTC AGTTGTTATT TTTGATATTT ATGGCACTTT AACTTTAGAT 1680 

AATGATGTAT TATCAGTAAG TACTTTAAAA GAATCGTATC AAATCACTGA AAGACCGTAC 1740 

CATTATTTAA CGACTAAnAT AAATGAAGAC TACCATAATA TTCCAAGATG AGGCAACTTA 1800 

AGTCATTA 1808 
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1320 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80:

TGGTCGTCAA TTTCTTGATT ATATCTATAA TCCTCATTTT CAATATTAGA GTCTGTAGAA 60

TCATCGATAT TATTATCATT CGCATGACTA GAAGCAGAAT CATTATTTTT ATCATTGCTT 120

TCTTCTTTTT TGAAGTCTTT ATTTATCAAG TAAATTTCTT CATCAAAATC AGCTTGTTGA 180

GATGTATCAT CTTTATTTTG ATTAGAAAAA TGTGTAGCCT TTGATCTTTT TCTTTGCCGT 240

CTTTTCTTAG ATGTATTCCT CGTAAATAAT TCTAATTCAT CTTTATCTTC ATTTGATTCT 300

TGTTGATCGT TCTTCGTTTT ATCATCCATC AATACTCACA CCCTTTAATA AGATGGTAAA 360

TGGGCACGGA ATCTTTCAAT AAATTTCTCT CCACGCTCTT CAAAAGTACT ATATTGATCC 420

CAACTCGCAC AAGCAGGTGA CAATAATACA ACATCATTTG GTTCTATAAT ATCTTGTACT 480

TTATCAACAG CGTCTTCGAC ATTGTTCGCT TCAATGACCG ATTTCCCTTG ACTATTACCT 540

AGTTTAGCAA ACTTAGCTTT CGTTTGTCCG AATACAACCA TCGCGCGAAC ATTTTCCATA 600

TAAGGAATGA GTTCGTCAAA TTCATTCCCT CGATCCAAAC CACCACATAA CCAAATGATT 660

GGTTGATTAA ATGAATTTAA GGCAAACTGT GTTGCTAGCG TGTTTGTTGC TTTGGAATCA 720

TTATAATATT TATTAGTTCT ATTAGTACCA ACATATTGCA ATCTATGCTC TATTCCTGAA 780

AATGTAGTTA AACTATCAAT AATTGCtTTA ATAGGTACAC CAGCanAATA CAAGCAAGCA 840

CAGCTGCTAA TATATTTcTA AATTATGTTC ACCAGGCAAT ACTAGAtCTT CAGTGTTAAT 900

AATaCGAACA CCTTTATaAA CGATAAAACC ATCTTtAATA TAAaTACCAT CArCTtCTTG 960

TTGAGTTGAG AAATACAATG TCTTAGCTTT TAATTCTTCC GACTCTATCA CTTGTCTTTG 1020

ATGATAATTA CAAATCAAAT AATCCTCTTC CGTTTGATTT TTATATATTT GCTTTTTAGC 1080

ATTTTGATAG TTTTCTAAAT TTTCATGGTA ATCTAGATGC GCCGAATAAA TGTTAGTAAT 1140

TATAGCAATG TGTGGTTTAT ACTTTTCGAT TCCAAGTAAC TGGAATGACG ACAACTCTGT 1200

AACTAAATAA TCTGTAGGCT TTACTTCTTG TGCTACTTTA GATGCAACAT AACCAATATT 1260

GCCGGATAAT CTTCCAGTTA AGCGACTTTT TTTAAACATA TCTCCAATTA GAGAAGTAAC 1320

(2) INFORMATION FOR SEQ ID NO: 81:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4280 base pairs
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(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81: 

TTTACACCAA TCAAAAAATC GAACTGATAT AAATAAGTAC AAAGCTTATC TATCAATCCG 60 

ATTTAGTTAT AAAACAAAAA AAGCCACAGT AATGTGGCTT TTTGTTATAT TCAGTATCAA 120

AATGGTATCA ATAGCCATTT TCGGAAGTCA AGAATGGCTT AACAACGCGG TTTAAAGCTA 180 

TCCAATACTA CCTTCCATTT CGAACTTGAT TAAACGGTTC ATTTCGACCG CGTATTCCAT 240 

TGGAAGTTCT TTTGTAAATG GTTCGATGAA TCCCATAACA ATCATTTCTG TCGCTTCTTC 300

TTCAGAAATA CCACGACTCA TTAGATAGAA TAATTGTTCT TCAGAAACTT TTGAAACCTT 360

GGCTTCATGT TCTAATGATA TTTGATCGTT GAATACTTCG TTATATGGAA TTGTATCTGA 420 

TGTTGATTCG TTATCTAAGA TTAATGTATC ACATTCAATA TTTGAACGAG CACCTTTTGC 480

TTTACGTCCA AAATGAACAA TACCGCGATA AATAACTTTA CCACCATTTT TAGAAATAGA 540 

TTTAGAAACA ATTGTAGAAG ATGTATTAGG TGCTTTATGA ATCATTTTAG CACCGGCATC 600 

TTGAACTTGT CCTTTACCAG CAAATGCAAT AGATAATGTA CTACCTTTTG CACCTTCACC 660 

TAAAAGAACA CAGTTTGGAT ATTTCATCGT TAACTTAGAA CCTAAGTTAC CATCTACCCA 720

TTCCATATTT CCGTTTTCAT AAACAAAAGT ACGTTTTGTA ACTAAATTGT ATACATTGTT 780 

CGCCCAGTTT TGAATCGTAG TATAACGAAC GTGCGCATCT TTATGCACAA TGATTTCCAC 840

AACAGCAGAG TGTAAAGAAC TAGTTGTATA AACTGGTGCA GTACAACCTT CTACGTAATG 900 

TACAGAAGCA CCTTCATCAG CAATGATTAA TGTACGTTCA AATTGACCCA TGTTCTCAGA 960

GTTAATACGG AAATAAGCTT GTAGTGGCGT ATCTAGTTTG ATATTTTTAG GTACATAAAT 1020 

GAAGGAACCA CCTGACCATA CTGCTGAGTT TAACGCCGCA AATTTGTTAT CTGCTGCAGG 1080 

TACTACAGAA GCAAAGTATT TTTTGAATAA TTCTTCATTT TCTTGTAAAG CACTATCTGT 1140

ATCTTTAAAG ATAATACCTT TTTCTTCAAG TTCTTTTTCC ATATTATGGT AAACAACTTC 1200 

AGATTCATAT TGAGCAGAAA CACCAGCTAA ATATTTTTGT TCAGCTTCAG GAATTCCTAA 1260 

TTTATCGAAA GTTCTTTTAA TTTCTTCTGG CACTTCATCC CATGAACGTT CAGCTTGTTC 1320 

TGAAGGCTTT ACATAGTAAG TAATGTCATC GAAATTCAAT TCTGATAAGT CGCCACCCCA 1380 

TTGAGGCATT GGCATTTTAT AAAACAATTT TAATGATTTA AGACGGAAAT CTAACATCCA 1440 

TTCCGGCTCA TTTTTCATGT TAGAAATTTC TCTAACGATA TTCTCAGTTA AACCACGTTC 1500 

TGATCTGAAA ATGGACACAT CATCGTCGTG GAATCCATAT TTATAATCCC CAACATCAGG 1560 
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TTTAATTCAT GATGTAAACC ATATTATAAC AATGACATGA CATCTTATAA AAATTTTTAT 1680

ACTTTTATAT GTCTAATATC AAAATTATCT ATGATTAACA GCATTCTATT CTTCTTCAGT 1740

CGTACCTTCT GCTTTACCTT CTTTAGCAAC AGTACCTTTT TCCAATGCTT TCCAAGCTAA 1800

TGTGGCACAT TTAATACGAG CTGGGAATTG AGATACACCT TGCAATGCTT CAATATCTCC 1860

CATTTCTTCT GTAATCACAT AGTCTTCACC AAGCATCATT TTCGTAAATT CTTGGCTCAT 1920

TTGCATTGCT TCTCCAAGTG AATGACCTTT AACAGCTTGT GTCATCATCG ATGCACTTGC 1980

CATTGAAATC GAACAACCTT CACCTTCAAA CTTAGCATCT TTTATAATGC CGTCTTCTAT 2040

ATCAAATGTT AGTCGTATAC GGTCACCGCA TGTCGGGTTA TTCATATCTA CTGTCATAGA 2100

CCCGTTATCT AATACACCTT TATTTCTAGG ATTTTTATAA TGATCCATAA TGACAGATCT 2160

ATATAATTGA TCTAGATTAT TAAAATTCAT AAGAGAAAAA CTCCTTCGTT TGTTTCAAGG 2220

CATTTATTAA CTGATCAACG TCTTCTTTCG TGTTGTATAT ATAAAAACTC GCTCTAGCTG 2280

TTGAAGACAC ATTTAACCAT TTCATTAACG GTTGCGCACA ATGATGCCCA GCTCTAACCG 2340

CTACACCTTC TGTATCTACG GCTGTAGCAA CATCGTGTGG ATGTACATCT TGTAAATTAA 2400

ACGTTATTAC ACCTGCACGA CGATCCTTTG GCGGGCCATA AATTTCAATT CCTTCAATTG 2460

CAGACATTTG CTCATAAGCA TATATCGTTA ATTCTTGTTC ATATTTATGA ATTGCATCAA 2520

AACCTATGCG TTCTAAATAG CGAATAGCTT CTGCAAGCCC AATTGCTTGA GCAATTAATG 2580

GAGTACCCGC CTCAAATTTA GTAGGTAAAT CAGCCCATGT TGCATCATAC TTACTTACAA 2640

AATCAATCAT GTCGCCACCG AACTCAATCG GTTCCATTTT TTGTAGTAAC TCACGTTTAC 2700

CAAATAATAC GCCAATACCT GTTGGTCCAA GCATTTTATG ACCACTAAAA CTATAAAAAT 2760

CAGCATTCAT TTCTTGCATA TCAAGTTTCA TATGTGGTGC TGctTGCGCC CCATCAACAC 2820

TGATSATTGC ACCATGTTGA TGAGCTATTT CTGCAATGGT TTTAACATCA TTAATTGTAC 2880

CGAGCACATT AGATATATGT GCAATAGCAA CGATCTTTGT TTTATCATTA ATCGTTTGCT 2940

TAATATCCTC GATGTTTAAT TCACCGTCAG CTGTCATTGG TATAAATTTC AATGTCGCAT 3000

TTTTACGCTT TGCTAACTGT TGCCAAGGAA CAATATTGGC ATGATGTTCC ATTTCAGTGA 3060

CAACAATTTC ATCGCCCTCT TCAACATTTG CATCACCATA GCTATGTGCT ACAAGGTTAA 3120

TCGACGCAGT TGTTCCGCGT GTAAAAATGA TTTCTTCAAA ATACTTCGCA TTAATAAAAC 3180

GACGAACGGT TTCACGGGCA TTTTCATAAC CATCAGTTGC CAATGATCCT AATGTATGAA 3240

CACCACGATG AACGTTTGAA TTATAACGCT TGTAGTAATC TTCTAAAACA TTTAACACTT 3300

GCACAGGCGT TTGACTTGTC GCTGTTGAAT CAAGATATGC TAAACGTTTG CCATTGACTT 3360
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CTTCATTCAC GACCTTTCTT AAATAAAAAT CCTAATCATT TAAATACTGA CGTTGTATTA 3480

GTCTTATACC AATATCGACA GTCTATATCT ATTACAAACT TTTATTTTCA AAATATTATT 3540 

TAGAAACTTT GCGTTCAATT ACTTCTCTCA ATTGACGTTT AACGTCTTCG ATAGGTAATT 3600 

CACGTACTAC TGGATCTAAG AAACCATGTA TAACAAGACG TTCCGCTTCT CTTTGAGAAA 3660

TACCACGACT CATTAAATAG TAAAGTTGAT CTGGATCAAC ACGACCTACT GATGCAGCAT 3720 

GACCAGCTTG TACATCATCT TCATCAATTA ATAAAATAGG ATTCGCGTCA CCACGAGCAT 3780 

GTTCAGATAA CATTAATACA CGTGATTCCT GATTAGCAAT TGATTTAGTT CCACCATGCT 3840 

TAATGTAGCC GATACCATTA AATACAGACG ATGCATGTTC TTTCATAACA CCATGTTTAA 3900 

GGATATAACC ATCTGTTTCT TTACCATATT GTACGATTTT AGATGTTAGA TTAATTTTTT 3960 

GTTCGCCTGT ACCTACAACT ACTGATTTAA GTGAACTTGT TGAACGATCA CCAAATAAAT 4020 

TTGTTGTATT ATCAATAATT TGGCTACCCT CATTCATTAA ACCTAGTGCC CAATTAATTG 4080

AGGCATCCGC TTCAGTAATA CCACGTCGAA TGATATGACC TGTAAAGCCT TTATCCATAT 4140 

AGTCCACTGA GCCATATGTG ATATTTGAAT TTGCACCAGC AATCACTTCA GAAATAATAT 4200 

TtAATTGATT TCCTTCACCA GATGCATTTG mTAAGTAATT TTCAACATAT GTGACTTCGG 4260

CGCTTTCTTC AGTAACGATG 4280 
(2) INFORMATION FOR SEQ ID NO: 82: 
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15598 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:

TCnGACTCGA ACGGTGmAAC TAttCCGTTG TaATTCCgGA GgAAsCAAGG TATGCCCATC 60

TGCaAAGAAA gaATGsAATG AACTTTTTGG AAATGTAGAA GTGGTAAATA AAGATAAAGG 120 

ATATTACATT CTGAGAAGTA TAAAAGCTTG AAATGAAATG GATATTCTGT TATAGTTATA 180 

TAATGTAAAA ATTTATGTTC AATAAGTGTG TACTTTTACG TTAAATAGAT AAGTTAATTA 240

AGAATAAATA TAGAATCGAA AATGGTGTCA TCATTAGTGT TGCCGTTTTC TTTTTGTCTT 300 

TTTATTAATA TGCTTATGGT ATTTAGCTAA AAGCGGATCA CATAATTTTT GAGGGGTGAA 360 

TCTGTTTGGC AGGTCAAGTT GTCCAATATG GAAGACATCG TAAACGTAGA AACTACGCGA 420

GAATTTCAGA AGTATTAGAA TTACCAAACT TAATAGAAAT TCAAACTAAA TCTTACGAGT 480 
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CTGGTAATTT GTCATTAGAG TTTGTGGATT ACCGTTTAGG AGAACCAAAA TATGATTTAG 600

AAGAATCTAA AAACCGTGAC GCTACTTATG CTGCACCTCT TCGTGTAAAA GTGCGTCTAA 660

TCATTAAAGA AACAGGAGAA GTTAAAGAAC AAGAAGTCTT TATGGGTGAT TTCCCATTAA 720

TGACTGATAC AGGTACGTTC GTTATCAATG GTGCAGAACG TGTAATCGTA TCTCAATTAG 780

TTCGTTCACC ATCCGTTTAT TTCAATGAAA AAATCGACAA AAATGGTCGT GAAAACTATG 840

ATGCAACAAT TATTCCAAAC CGTGGTGCAT GGTTAGAATA TGAAACAGAT GCTAAAGATG 900

TTGTATACGT ACGTATTGAT AGAACACGTA AACTACCATT AACAGTATTG TTACGTGCAT 960

TAGGTTTCTC AAGCGACCAA GAAATTGTTG ACCTTTTAGG TGACAATGAA TATTTACGTA 1020

ATACTTTAGA GAAAGACGGC ACTGAAAACA CTGAACAAGC GTTATTAGAA ATCTATGAAC 1080

GTTTACGTCC AGGTGAACCA CCAACTGTTG AAAATGCTAA AAGTCTATTG TATTCACGTT 1140

TCTTTGATCC AAAACGCTAT GACTTAGCAA GCGTGGGTCG TTATAAAACA AACAAAAAAT 1200

TACATTTAAA ACATCGTTTA TTTAATCAAA AATTAGCTGA GCCAATTGTA AATACTGAAA 1260

CTGGTGAAAT TGTAGTTGAA GAAGGTACAG TGCTTGATCG TCGTAAAATC GACGAAATCA 1320

TGGATGTACT TGAATCAAAT GCAAACAGCG AAGTGTTTGA ATTGCATGGT AGCGTTATAG 1380

ACGAGCCAGT AGAAATTCAA TCAATTAAAG TATATGTTCC TAACGATGAT GAAGGTCGTA 1440

CGACAACTGT AATTGGTAAT GCTTTCCCTG ACTCAGAAGT TAAATGCATT ACACCAGCAG 1500

ATATCATTGC TTCAATGAGT TACTTCTTTA ACTTATTAAG CGGTATTGGA TATACAGATG 1560

ATATTGACCA TTTAGGTAAC CGTCGTTTAC GTTCTGTAGG TGAATTACTA CAAAACCAAT 1620

TCCGTATCGG TTTATCAAGA ATGGAAAGAG TTGTACGTGA AAGAATGTCA ATTCAAGATA 1680

CTGAGTCTAT CACACCTCAA CAATTAATTA ATATTCGACC TGTTATTGCA TCTATTAAAG 1740

aattZttttgg TAGCTCTCAA TTATCACAAT TCATGGACCA agcaaaccca TTAGCTGAGT 1800

TAACGCATAA ACGTCGTCTA TCAGCATTAG GACCTGGTGG TTTAACACGT GAACGTGCTC 1860

AAATGGAAGT ACGTGACGTT CACTACTCTC ACTATGGCCG TATGTGTCCA ATTGAAACAC 1920

CTGAGGGACC AAACATTGGA TTGATTAACT CATTATCAAG TTATGCACGT GTAAATGAAT 1980

TCGGCTTTAT TGAAACACCA TATCGTAAAG TTGATTTAGA TACACATGCT ATCACTGATC 2040

AAATTGACTA TTTAACAGCT GACGAAGAAG ATAGCTATGT TGTAGCACAA GCAAACTCTA 2100

AATTAGATGA AAATGGTCGT TTCATGGATG ATGAAGTTGT ATGTCGTTTC CGTGGTAACA 2160

ATACAGTTAT GGCTAAAGAA AAAATGGATT ATATGGATGT ATCGCCGAAG CAAGTTGTTT 2220

CAGCAGCGAC AgcATGTATT CCATTCTTAG AAAATGATGA CTCAAACCGT GCATTGATGG 2280
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CAGGTATGGA ACACGTTGCA GCACGTGATT CTGGTGCGGC TATTACAGCT AAGCACAGAG 2400 

GTCGTGTTGA ACATGTTGAA TCTAATGAAA TTCTTGTTCG TCGTCTAGTT GAAGAGAACG 2460 

GCGTTGAGCA TGAAGGTGAA TTAGATCGCT ATCCATTAGC TAAATTTAAA CGTTCAAACT 2520 

CAGGTACATG TTACAACCAA CGTCCAATCG TTGCAGTTGG AGATGTTGTT GAGTATAACG 2580

AGATTTTAGC AGATGGACCA TCTATGGAAT TAGGAGAAAT GGCATTAGGT AGAAACGTAG 2640 

TAGTTGGTTT CATGACTTGG GACGGTTACA ACTATGAGGA TGCCGTTATC ATGAGTGAAA 2700 

GACTTGTGAA AGATGACGTG TATACTTCTA TTCATATTGA AGAGTATGAA TCAGAAGCAC 2760 

GTGATACTAA GTTAGGACCT GAAGAAATCA CAAGAGATAT TCCTAATGTT TCTGAAAGTG 2820 

CACTTAAGAA CTTAGACGAT CGTGGTATCG TTTATATTGG TGCAGAAGTA AAAGATGGAG 2880 

ATATTTTAGT TGGTAAAGTA ACGCCTAAAG GTGTAACTGA GTTAACTGCC GAAGAAAGAT 2940 

TGTTACATGC AATCTTTGGT GAAAAAGCAC GTGAAGTTAG AGATACTTCA TTACGTGTAC 3000

CTCACGGCGC TGGCGGTATC GTTCTTGATG TAAAAGTATT CAATCGTGAA GAAGGCGACG 3060 

ATACATTATC ACCTGGTGTA AACCAATTAG TACGTGTATA TATCGTTCAA AAACGTAAAA 3120 

TTCATGTTGG TGATAAGATG TGTGGTCGAC ATGGTAACAA AGGTGTCATT TCTAAGATTG 3180

TTCCTGAAGA AGATATGCCT TACTTACCAG ATGGACGTCC GATCGATATC ATGTTAAATC 3240 

CTCTTGGTGT ACCATCTCGT ATGAACATCG GACAAGTATT AGAGCTACAC TTAGGTATGG 3300

CTGCTAAAAA TCTTGGTATT CACGTTGCAT CACCAGTATT TGACGGTGCA AACGATGACG 3360

ATGTATGGTC AACAATTGAA GAAGCTGGTA TGGCTCGTGA TGGTAAAACT GTACTTTATG 3420

ATGGACGTAC AGGTGAACCA TTCGATAACC GTATTTCAGT AGGTGTAATG TACATGTTGA 3480

AACTTGCGCA CATGGTTGAT GATAAATTAC ATGCGCGTTC AACAGGACCA TATTCACTTG 3540 

tTACACAACA ACCACTTGGC GGTAAAGCGC AATTCGGTGG ACAACGTTTT GGTGAGATGG 3600 

AGGTATGGGC ACTTGAAGCA TATGGTGCTG CATACACATT ACAAGAAATC TTAACTTACA 3660 

AATCCGATGA TACAGTAGGA CGTGTGAAAA CATACGAGGC TATTGTTAAA GGTGAAAACA 3720 

TCTCTAGACC AAGTGTTCCA GAATCATTCC GAGTATTGAT GAAAGAATTA CAAAGTTTAG 3780 

GTTTAGATGT AAAAGTTATG GATGAGCAAG ATAATGAAAT CGAAATGACA GACGTTGATG 3840

ACGATGATGT TGTAGAACGC AAAGTAGATT TACAACAAAA TGATGCTCCT GAAACACAAA 3900 

AAGAAGTTAC TGATTAATAC GCAATTTACA AAACAGGCAA AAAGATACTA AGCTGAATTT 3960 

TATTGATGAT TCAGTTTAGT ACTTTAAGCC ATTTTAAATA AATGCAAATC AATCAAATAG 4020

CACAGCTAAT CTAAATTGAA GGAGGTAGGC TCCTTGATTG ATGTAAATAA TTTCCATTAT 4080 
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AAACCTGAAA CAATCAACTA CCGTACATTA AAACCTGAAA AAGATGGTCT ATTCTGTGAA 4200 

AGAATTTTCG GACCTACAAA AGACTGGGAA TGTAGTTGTG GTAAATACAA ACGTGTTCGC 4260 

TACAAAGGCA TGGTCTGTGA CAGATGTGGA GTTGAAGTAA CTAAATCTAA AGTACGTCGT 4320

GAAAGAATGG GTCACATTGA ACTTGCTGCT CCAGTTTCTC ACATTTGGTA TTTCAAAGGT 4380

ATACCAAGTC GTATGGGATT ATTACTTGAC ATGTCACCAA GAGCATTAGA AGAAGTTATT 4440

TACTTTGCTT CTTATGTTGT TGTAGATCCA GGTCCAACTG GTTTAGAAAA GAAAACTTTA 4500

TTATCTGAAG CTGAATTCAG AGATTATTAT GATAAATACC CAGGTCAATT CGTTGCAAAA 4560 

ATGGGTGCAG AAGGTATTAA AGATTTACTT GAAGAGATTG ATCTTGACGA AGAACTTAAA 4620 

TTGTTACGCG ATGAGTTGGA ATCAGCTACT GGTCAAAGAC TTACTCGTGC AATTAAACGT 4680 

TTAGAAGTTG TTGAATCATT CCGTAATTCA GGTAACAAAC CTTCATGGAT GATTTTAGAT 4740

GTACTTCCAA TCATCCCACC AGAAATTCGT CCAATGGTTC AATTAGATGG TGGACGATTT 4800

GCAACAAGTG ACTTAAACGA CTTATACCGT CGTGTAATTA ATCGAAATAA TCGTTTGAAA 4860

CGTTTATTAG ATTTAGGTGC ACCTGGTATC ATCGTTCAAA ACGAAAAACG TATGTTACAA 4920

GAAGCCGTTG ACGCTTTAAT TGATAATGGT CG7CGTGGTC GTCCAGTTAC TGGCCCAGGT 4980

AACCGTCCAT TAAAATCTTT ATCTCATATG TTAAAAGGTA AACAAGGTCG TTTCCGTCAA 5040 

AACTTACTTG GTAAACGTGT TGACTATTCA GGACGTTCAG TTATTGCAGT AGGTCCAAGC 5100 

TTGAAAATGT ACCAATGTGG TTTACCAAAA GAAATGGCAC TTGAACTATT TAAACCATTC 5160 

GTAATGAAAG AATTAGTTCA ACGTGAAATT GCAACTAACA TTAAAAATGC GAAGAGTAAA 5220 

ATCGAACGTA TGGATGATGA AGTTTGGGAC GTATTGGAAG AAGTAATTAG AGAACATCCT 5280

GTATTACTTA ACCGTGCACC AACACTTCAT AGACTTGGTA TTCAAGCATT TGAACCAACT 5340 

TTAGTTGAAG GTCGTGCGAT TCGTCTACAT CCACTTGTAA CAACAGCTTA TAACGCTGAC 5400 

TTTGACGGTG ACCAAATGGC GGTTCACGTT CCTTTATCAA AAGAGGCACA AGCTGAAGCA 5460

AGAATGTTGA TGTTAGCAGC ACAAAACATC TTGAACCCTA AAGATGGTAA ACCTGTAGTT 5520

ACACCATCAC AAGATATGGT ACTTGGTAAC TATTACCTTA CTTTAGAAAG AAAAGATGCA 5580

GTAAATACAG GCGCAATCTT TAATAATACA AATGAAGTAT TAAAAGCATA TGCAAATGGC 5640

TTTGTACATT TACACACTAG AATTGGTGTA CATGCAAGTT CGTTCAATAA TCCAACATTT 5700 

ACTGAAGAAC AAAACAAAAA GATTCTTGCT ACGTCAGTAG GTAAAATTAT ATTCAATGAA 5760 

ATCATTCCAG ATTCATTTGC TTATATTAAT GAACCTACGC AAGAAAACTT AGAAAGAAAG 5820

ACACCAAACA GATATTTCAT CGATCCTACA ACTTTAGGTG AAGGTGGATT AAAAGAATAC 5880 
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GAAGTATTCA ACAGATTTAG CATCACTGAT ACATCAATGA TGTTAGACCG TATGAAAGAC 
TTAGGATTCA AATTCTCATC TAAAGCTGGT ATTACAGTAG GTGTTGCTGA TATCGTAGTA 
TTACCTGATA AGCAACAAAT ACTTGATGAG CATGAAAAAT TAGTCGACAG AATTACAAAA 
CAATTCAACC GTGGTTTAAT CACTGAAGAA GAAAGATATA ATGCAGTTGT TGAAATTTGG 
ACAGATGCAA AAGATCAAAT TCAAGGTGAA TTGATGCAAT CACTTGATAA AACTAACCCA 
ATCTTCATGA TGAGTGATTC AGGTGCCCGT GGTAACGCAT CTAACTTTAC ACAGTTAGCA 
GGTATGCGTG GATTGATGGC CGCACCATCT GGTAAGATTA TCGAATTACC AATCACATCT 
TCATTCCGTG AAGGTTTAAC AGTACTTGAA TACTTCATCT CAACTCACGG TGCACGTAAA 
GGTCTTGCCG ATACAGCACT TAAAACAGCT GACTCAGGAT ATCTTACTCG TCGTCTTGTT 
GACGTGGCAC AAGATGTTAT TGTTCGTGAA GAAGACTGTG GTACTGATAG AGGTTTATTA 
GTTTCTGATA TTAAAGAAGG TACAGAAATG ATTGAACCAT TTATCGAACG TATTGAAGGT 
CGTTATTCTA AAGAAACAAT TCGTCATCCT GAAACTGATG AAATAATCAT TCGTCCTGAT
GAATTAATTA CACCTGAAAT TGCTAAGAAA ATTACAGATG CTGGTATTGA ACAAATGTAT 
ATTCGCTCAG CATTTACTTG TAACGCACGA CATGGTGTTT GTGAAAAATG TTACGGTAAA 
AACCTTGCTA CTGGTGAAAA AGTTGAAGTT GGTGAAGCAG TTGGTACAAT TGCAGCCCAA 
TCTATCGGTG AACCAGGTAC ACAGCTTACA ATGCGTACAT TCCATACAGG TGGGGTAGCA 
GGTAGCGATA TCACACAAGG TCTTCCTCGT ATTCAAGAGA TTTTCGAAGC ACGTAACCcT 
AAAGGTCAAG CGGTAATTAC GGAAATCGAA GGTGTCGTAG AAGATATTAA ATTAGCAAAA 
GATAGACAAC AAGAAATTGT TGTTAAAGGT GCTAATGAAA CAAGATCATA CCTTGCTTCA 
GGTACTTCAA GAATTATTGT AGAAATCGGT CAACCAGTTC AACGTGGTGA AGTATTAACT 
GAAGGTTCTA TTGAACCTAA GAATTACTTA TCTGTTGCTG GATTAAACGC GACTGAAAGC 
TACTTATTAA AAGAAGTACA AAAAGTTTAC CGTATGCAAG GTGTAGAAAT CGACGATAAA 
CACGTTGAGG TTATGGTTCG ACAAATGTTA CGTAAAGTTA GAATTATCGA AGCAGGTGAT 
ACGAAGTTAT TACCAGGTTC ATTAGTTGAT ATTCATAACT TTACAGATGC AAATAGAGAA 
GCATTTAAAC ACCGTAAGCG TCCTGCAACA GCTAAACCAG TATTACTTGG TATTACTAAA 
GCATCACTTG AAACAGAAAG TTTCTTATCT GCAGCATCAT TCCAAGAAAC AACAAGAGTT 
CTTACAGATG CAGCAATTAA AGGTAAGCGT GATGACTTAT TAGGTCTTAA AGAAAACGTA 
ATTATTGGTA AGTTAATTCC AGCTGGTACT GGTATGAGAC GTTATAGCGA CGTAAAATAC 
GAAAAAACAG CTAAACCAGT TGCAGAAGTT GAATCTCAAA CTGAAGTAAC GGAATAACAA 
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ATGTTGACGA ATTCTCTTGT TCAATGTTAA TATATTAAAG GTTGATGCAA GCAGAACTTT 7800 

GGAGGATAAA TTATTGTCTA AGGAAAAAGT tGCACGCTTT AACAAACAAC ATTTTGTAGT 7860 

TGGTCTTAAA GAAACGCTTA AAGCGTTAAA GAAAGATCAA GTTACATCTT TGATTATTGC 7920 

TGAAGACGTT GAAGTATATT TAATGACTCG CGTGTTAAGC CAAATCAATC AGAAAAATAT 7980 

ACCTGTATCT TTTTTCAAAA GCAAACATGC TTTGGGTAAA CATGTAGGTA TTAACGTCAA 8040 

TGCGACAATA GTAGCATTGA TTAAATGAGA ATTAGTAAGT GTTTTACTTA CTAAATTTTA 8100 

TTTAACCTAA AAATGAACCA CCTGGATGTG TGGGATTAAA AAGTGAAGAG AGGAGGACAT 8160 

ATCACATGCC AACTATTAAC CAATTAGTAC GTAAACCAAG ACAAAGCAAA ATCAAAAAAT 8220 

CAGATTCTCC AGCTTTAAAT AAAGGTTTCA ACAGTAAAAA GAAAAAATTT ACTGACTTAA 8280 

ACTCACCACA AAAACGTGGT GTATGTACTC GTGTAGGTAC AATGACACCT AAAAAACCTA 8340 

ACTCAGCGTT ACGTAAATAT GCACGTGTGc gTtTATCAAA CAACATCGAA ATTAACGCAT 8400

ACATCCCTGG TATCGGACAT AACTTACAAG AACACAGTGT TGTACTTGTA CGTGGTGGAC 8460 

GTGTAAAAGA CTTACCAGGT GTGCGTTACC ATATTGTACG TGGAGCACTT GATACTTCAG 8520 

GTGTTGACGG ACGTAGACAA GGTCGTTCAT TATACGGAAC TAAGAAACCT AAAAACTAAG 8580

AATTTAGTTT TTAATTAAAT CTTAAACTTA AAATATTTAA TATAAGGAAG GGAGGATTTA 8640 

CATTATGCCT CGTAAAGGAT CAGTACCTAA AAGAGACGTA TTACCAGATC CAATTCATAA 8700

CTCTAAGTTA GTAACTAAAT TAATTAACAA AATTATGTTA GATGGTAAAC GTGGAACAGC 8760 

ACAAAGAATT CTTTATTCAG CATTCGACCT AGTTGAACAA CGCAGgtTCG TGATGCATTA 8820 

GAAGTATTCG AAGAAGCAAT CAACAACATT ATGCCAGTAT TAGAAGTTAA AGCTCGTCGC 8880

GTAGGTGGTT CTAACTATCA AGTACCAGTA GAAGTTCGTC CAGAGCGTCG TACTACTT7A 8940 

GGTTTACGTT GGTTAGTTAA CTATGCACGT CTTCGTGGTG AAAAAACGAT GGAAGATCGT 9000 

TTAGCTAACG AAATTTTAGA TGCAGCAAAT AATACAGGTG GTGCCGTTAA GAAACGTGAG 9060 

GACACTCACA AAATGGCTGA AGCAAACAAA GCATTTGCTC ACTACCGTTG GTAAGATAAA 9120 

AGCTTTTACC CTGAGTGTGT TCTATATTAA TGAATTTTCA TTAAGCGTTC ATGCTTAGGG 9180

CATCGCCATA TCTATCGTAT TTATTCAGTA ATATAAACTG GAAGGAGAAA AAATACATGG 9240

CTAGAGAATT TTCATTAGAA AAAACTCGTA ATATCGGTAT CATGGCTCAC ATTGATGCTG 9300

GTAAAACGAC TACGACTGAA CGTATTCTTT ATTACACTGG CCGTATCCAC AArGknGGTG 9360

AAaCACACGA AGGTGCTTCA CAAATGGACT GGATGGAGCA AGAACAAGAC CGTGGTATTA 9420

CTATCACATC TGCTGCAACA ACAGCAGCTT GGGAAGGTCA CCGTGTAAAC ATTATCGATA 9480 
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GCCTAGGTTA AAATACAAGG TGAGCTTAAA TGTAAGCTAT CATCTTTATA GTTTGATTTT 11400

TTGGGGTGAA TGCATTATAA AAGAATTGTA AAATTCTTTT TGCATCGCTA TAAATAATTT 11460

CTCATGATGG TGAGAAACTA TCATGAGAGA TAAATTTAAA TATTATTTTT AATTAGAATA 11520 

GGAGAGATTT TATAATGGCA AAAGAAAAAT TCGATCGTTC TAAAGAACAT GCCAATATCG 11580 

GTACTATCGG TCACGTTGAC CATGGTAAAA CAACATTAAC AGCAGCAATC GCTACTGTAT 11640

TAGCAAAAAA TGGTGACTCA GTTGCACAAT CATATGACAT GATTGACAAC GCTCCAGAAG 11700 

AAAAAGAACG TGGTATCACA ATCAATACTT CTCACATTGA GTACCAAACT GACAAACGTC 11760 

ACTACGCTCA CGTTGACTGC CCAGGACACG CTGACTACGT TAAAAACATG ATCACTGGTG 11820 

CTGCTCAAAT GGACGGCGGT ATCTTAGTAG TATCTGCTGC TGACGGTCCA ATGCCACAAA 11880 

CTCGTGAACA CATTCTTTTA TCACGTAACG TTGGTGTACC AGCATTAGTA GTATTCTTAA 11940 

ACAAAGTTGA CATGGTTGAC GATGAAGAAT TATTAGAATT AGTAGAAATG GAAGTTCGTG 12000 

ACTTATTAAG CGAATATGAC TTCCCAGGTG ACGATGTACC TGTAATCGCT GGTTCAGCAT 12060 

TAAAAGCTTT AGAAGGCGAT GCTCAATACG AAGAAAAAAT CTTAGAATTA ATGGAAGCTG 12120 

TAGAXACTTA CATTCCAACT CCAGAACGTG ATTCTGACAA ACCATTCATG ATGCCAGTTG 12180 

AGGACGTATT CTCAATCACT GGTCGTGGTA CTGTTGCTAC AGGCCGTGTT GAACGTGGTC 12240 

AAATCAAAGT TGGTGAAGAA GTTGAAATCA TCGGTTTACA TGACACATCT AAAACAACTG 12300 

TTACAGGTGT TGAAATGTTC CGTAAATTAT TAGACTACGC TGAAGCTGGT GACAACATTG 12360 

GTGCATTATT ACGTGGTGTT GCTCGTGAAG ACG7ACAACG TGGTCAAGTA TTAGCTGCTC 12420 

CTGGTTCAAT TACACCACAT ACTGAATTCA AAGCAGAAGT ATACGTATTA TCAAAAGACG 12480 

AAGGTGGACG TCACACTCCA TTCTTCTCAA ACTATCGTCC ACAATTCTAT TTCCGTACTA 12540 

CTGAGGTAAC TGGTGTTGTT CACTTACCAG AAGGTACTGA AATGGTAATG CCTGGTGATA 12600 

ACGTTGAAAT GACAGTAGAA TTAATCGCTC CAATCGCGAT TGAAGACGGT ACTCGTTTCT 12660

CAATCCGTGA AGG7GGACGT ACTGTAGGAT CAGGCGTTGT TACTGAAATC ATTAAATAAT 12720

TTCTAATTTC TTAGATTTTA TA7AAAAAGA AGATCCCTCA ATCGAGGGGt CTTTTTTTAA 12780

TGTGTAAATT TTGTAATGGC TATTCGATTT AGAAGAACAA TAATTGATGA AAGACTGACT 12840

AATAAAACTT ATAACTGATA ATACTGTTTA AATAAAATTG TTGAGTCTTG GACATTGTAA 12900 

AATGCTCCCT TCAAAGTTTT CATTTTTTCa ATGTCTACTT TGAAGGGAGC ATTTCATTAG 12960 

TTTATGTCTC AGATTCATAT CTTTCAATTA ATTTAAATGC TTAATTTGTT TTAAATACTT 13020 

GCTCTAATTC TATGATTTTT AAAAATACAG CTACAGCGTA TTTTAATGAT TTTTCATCAA 13080 
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TCAGAAAGAA TGCACCTGGT CGTACTTTCA AATAATGTGA AAAATCTTCT CCAATCATCA 13200

TTAAATCTGA TTCATTAAAG CGTACATGTA AGTCATTTGT TGCTTCTTTA ATAACTTGAT 13260

ATGCTTTCTC GTTATTATGG ACAGGCAAAT ACCCTTTAAT ATAATTCAAA TCATAGTTAA 13320

TATCATTTGC TATTGCTAAA CCTTGTAGAA GCTTATCCAT TTTGTCCATT ACATGATTCT 13380

GTATATCTGA ATCGAAAGTT CTAACTGTAC CTTTACAAAA TGCTTGATCA GGAATAACGC 13440

TATCTGTGGT GCCTGCTTGA ATCATTCCAA ATGAAAGTAC AGCTTGTTTA ACTGGATCGA 13500

TCGTACGTGA AATTATTITT TGTGCACTTA AAATGAACTC TGCCATGATT ACTATTGGGT 13560

CAATGGTTTC ATGAGGTTTG GCACCATGAC CACCACGACC TTTAAATGTG ACGCTAAATT 13620

CATCTGGAGA GGCCATGATT GCCCCCGCAC GTGAATGAAT AGTTCCAGTA GGATAACCAC 13680

TCCATAAATG TGTACCGTAA ATTCTATCTA CATTTTCCAG ACATCCAGCA TCTATCATTT 13740

CTTGAGAACC ACCTGGCATG ATTTCTTCAC CGTACTGGAA TATTAATACA ACATTACCTT 13800

CTAATAAATG TTTATGTTCA TCTAAAATCT CTGCTACAGT AAGTAAAATT GCTGTATGAC 13860

CATCATGCCC ACACGCATGC ATACATCCTG GATTTTTAGA CTTATAAGGC ACATCGTTTA 13920

ATTCCTCGAC AGGTAACGCA TCAAAGTCAG CTCTTAATGC AATGGTAGGT CCTGTGCCCA 13980

AGCCTTTAAA TGTGGCTTTG ATACCATTGC GGCCGATAGG AGTTTCAATA TCACAAGATA 14040

ACTGGCTTAA TTGGTTAACA ATATAATCAT GTGTTTGAAA TTCTTCAAAA GATAACTCAG 14100

GATATTGGTG TAAATAACGT CTGAGTTGAA TTGTTTTATT TTCTTTATTA TTTGCTAGTT 14160

GGAACCAATC TAACACCCTT ATCACTACTT TCTAAAATAA TGTTTATAGT ATAACATTTT 14220

ATGAAATTAT CGTACTAAAT GATTGCTTTG AGATATTTTA TCTATGAATG ATAAGGCTTT 14280

CAAGTTATGT AGAATTACTG TATGATAAAG GTATTACCAA ACAATACTTA AGGGGGATTA 14340

TATACTGTGG TTCAATCATT ACATGAGTTT TTAGAGGAAA ATATAAATTA TCTAAAAGAA 14400

AATGGTTTGT ATAATGAAAT AGATACAATT GAAGGTGCAA ACGGACCAGA AATCAAAATC 14460

AATGGGAAAT CATACATTAA CTTATCTTCA AATAATTATT TAGGACTAGC AACAAATGAA 14520

GATTTGAAAT CaGctGCAAA AGCAGCTATT GATACACATG GTGTAGGTGC AGGCGCTGTT 14580

CGTACAATCA ATGGTACATT AGATTTACAC GACGAATTAG AAGAAACACT AGCAAAATTT 14640

AAAGGAACAG AAGCTGCAAT AGCTTATCAA TCAGGATTTA ATTGTAATAT GGCTGCTATT 14700

TCAGCTGTCA TGAATAAAAA TGATGCTATT TTATCAGATG AGCTTAATCA TGCATCAATT 14760

ATTGATGGAT GTCGCTTATC TAAAGCTAAA ATTATTCGAG TTAACCATTC AGACATGGAT 14820

GATTTACGTG CGAAAGCAAA AGAAGCAGTT GAATCAGGTC AATACAATAA AGTGATGTAT 14880
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ATTGCAGAAG AATTTGGTTT ATTAACTTAT GTTGACGACG CTCATGGTTC AGGTGTTATG 15000 

GGTAAAGGCG CTGGTACGGT TAAACATTTT GGTTTACAAG ATAAAATCGA TTTCCAAATA 15060 

GGTACGCTTT CTAAAGCAAT TGGTGTCGTT GGCGGTTATG TAGCAGGTAC AAAAGAGTTA 15120 

ATAGATTGGT TAAAAGCACA ATCACGACCA TTCTTATTCT CTACATCATT AGCACCTGGG 15180 

GATACCAAAG CAATAACTGA AGCAGTTAAA AAGTTAATGG ATTCAACTGA ATTACATGAT 15240 

AAATTATGGA ACAATGCACA ATATTTAAAA AATGGATTGT CAAAATTAGG ATATGATACA 15300 

GGTGAGTCAG AAACTCCAAT TACACCAGTA ATTATTGGTG ATGAAAAAAC AACTCAAGAA 15360 

TTTAGTAAGC GTTTAAAAGA CGAAGGTGTC TATGTGAAAT CTATCGTTTT CCCAACAGTA 15420 

CCAAGAGGTA CAGGACGTGT AAGAAATATG CCTACAGCTG CACATACAAA AGACATGTTA 15480 

GATGAAGCAA TTGCGGCTTA TGAAAAAGTA GGAAAAGAAA TGAAGTTGAT TTAATATTTA 15540 

TTTATTCCCA CGGCAAATAT TGTCGTGGGC TTTTTTTAAT GTTTAGTTTA TTAACAGT 15598 
(2) INFORMATION FOR SEQ ID NO: 83: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 661 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83:

AAGTAAATCA ACTTACTGGG ATAAGAATAA AGGCGATTAT AGTAACAAGT TGATTTTATT 60 

CGAAAAACAT TTTGAACCGG TTCTGGGTAT CAAGATGCAA CATAGTGGAG GTCATAGCTT 120 

TGGCCACACG ATTATTACGA TTGAAAGTCA AGGAGATAAA GCAGTTCATA TGGGTGATAT 180 

ATTCCCAACT ACTGCACATA AAAATCCTCT ATGGGTAACG GCATATGATG ATTATCCTAT 240

GCAATCGATT CGTGAAAAAG AACGCATGAT ACCATATTTT ATTCAGCAAC AATATTGGTT 300 

CTTGTTTTAT CATGATGAAA ACTACTTTGC TGTAAAATAC AGCGATAATG GTGAAAACAT 360 

AGATGCATAT ATTTTACGTG AAACATTAGT TGATAATAAC TAAAATAAAG ATGTATTACT 420 

AAACAAATTT TCAAAAATAA AAAATTGAGC CACATCCAAT CTTACTAATT AGGGTGTGGC 480 

TCATTTTTAA GTTTTACgAT CCAAATCAAA TATGGaTAAA ATTCgTATTA ACGCTCTACa 540 

ATGtTAATGA CTTCACCAGT ATATGCATCT GCATAAAAAT CATAATGAAT ATTTTGACCA 600 

TTTTTAATAG TTGTAATTCC ACCTTGATAA ACTAAACGGT ATTTATCAGT TTCAGGATGA 660

A 661 






EP0 786 519 A2 

(2) INFORMATION FOR SEQ ID NO: 84:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5738 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84: 

GCAGACGGTA CAGCAGTTAA AGTCGCACCA AaACTGTAGT GAATcTAATC GGTGcATTCT 60 
TTTTAGGATT AGTTGTCGCG CTTATATATA TCTTCTTCAA AGTAATTTTC GATAAGCGAA 120 
TTAAAGATGA AGAAGATGTA GAGAAAGAAT TAGGATTGCC TGTATTGGGT TCAATTCAAA 180 

AATTTAATTA AGGATGGTTG CTACTTATGT CAAAAAAGGA AAATACGACA ACAACACTAT 240 
TTGTATATGA AAAACCAAAA TCAACAATTA GTGAAAAGTT TCGAGGTATA CGTTCAAACA 300 

TCATGTTTTC AAAAGCAAAT GGTGAAGTAA AGCGCTTATT GGTTACTTCT GAAAAGCCTG 360

GTGCAGGTAA AAGTACAGTT GTATCGAATG TAGCGATTAC TTATGCACAA GCAGGCTATA 420 

AGACATTAGT TATTGATGGC GATATGCGTA AgcCAACACA AAACTATATT TTTAATGAGC 480 

AAAATAATAA TGGACTATCA AGCTTAATCA TTGGTCGAAC GACTATGTCA GAAGCAATTA 540

CGTCGACAGA AATTGAAAAT TTAGATTTGC TAACAGCTGG CCCTGTACCT CCAAATCCAT 600 
CTGAGTTAAT TGGGTCTGAA AGGTTCAAAG AATTAGTTGA TCTGTTTAAT AAACGTTACG 660

ACATTATTAT TGTCGATACA CCGCCAGTTA ATACTGTGAC TGATGCACAA CTATATGCGC 720

GTGCTATTAA AGATAGTCTG TTAGTAATTG ATAGTGAAAA AAATGATAAr AATGAAGTTA 780

AAAAAGCAAA AGCACTTATG GAAAAAGCAG GCAGTAACAT TCTAGGTGTC ATTTTGAACA 840 

AGACAAAGGT CGATAAATCT TCTAGTTATT ATCACTATTA TGGAGATGAA TAAGTATGAT 900
TGATATTCAT AACCATATAT TGCCTAATAT CGATGACGGT CCGACAAATG AAACAGAGAT 960

GATGGATCTT TTAAAACAAG CGACAACACA AGGTGTTACA GAAATCATTG TAACATCACA 1020 

TCACTTACAT CCTCGATATA CCACACCTAT AGAAAAAGTG AAATCATGTT TAAACCATAT 1080 

TGAAAGCTTA GAGGAAGTAC AAGCACTAAA TCTAAAGTTT TATTATGGTC AGGAAATAAG 1140 

45 AATTACCGAT CAAATCCTTA ATGATATTGA TCGAAAAGTT ATTAACGGTA TTAATGATTC 1200 

ACGCTATTTA CTAATAGAAT TTCCATCAAA TGAAGT7CCA CACTATACTG ATCAATTATt 1260 

TTTCGAATcA CAGAGTAAAG GCTTTGTACC GATTATTGCA CATCCAGAGC GGAATAAAGC 1320 

AATAAGTCAA AACCTTGACA TACTATACGA TTTAATTAAC AAAGGTGCTT TAAGTCAAGT 1380

GACAACGGcG TCATTAGCGG GTATTTCCGG TAAAAAAATT AGAAAATTAG CAATTCAAAT 1440
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GTTCTTAATG AAAGACTTAT TTAATGATAA GAAATTACGT GATTATTATG AAGATATGAA 15 50 

CGGATTTATT AGTAATGCGA AGTTAGTTGT TGATGATAAA AAAATTCCTA AACGAATGCC 1620 

ACAACAAGAT TATAAACAGA AAAGATGGTT TGGGTTATAA ACAGCAAATG AGGGGTTTTA 1680

TGGCACATTT ATCTGTGAAA TTGCGGCTTT TAATACTAGC ATTAATCGAT TCACTGATAG 1740 

TGACATTTTC AGTATTCGTA AGTTATTACA TTTTAGAACC GTATTTCAAA ACATATTCTG 1800 

TCAAATTATT AATATTGGCA GCTATATCAC TATTCATATC GCATCATATT TCaGCATTTA 1860

TTTTTAATAT GTATCATCGA GCGTGGGAAT ATGCCAGTGT GAGTGAATTG ATTTTAATTG 1920 

TTAAAGCTGT GACGACATCT ATCGTTATTA CGATGGTGGT CGTGACAATT GTTACAGGCA 1980 


ATAGACCGTT TTTTAGATTG TATTTAATTA CTTGGATGAT GCACTTGATT TTAATAGGTG 2040 

GCTCAAGGTT ATTTTGGCGT ATTTATCGGA AATACCTTGG AGGTAAGTCA TTTAATAAGA 2100 

AGCCAACTTT AGTTGTTGGT GCTGGTCAAG CAGGTTCAAT GCTGATTAGA CAAATGTTGA 2160

AAAGTGACGA AATGAAACTT GAACCGGTAT TAGCAGTCGA TGATGACGAA CATAAACGCA 2220 

ATATCACAAT TACTGAGGGT GTAAAAGTCC AAGGTAAAAT TGCGGATATT CCAGAACTAG 2280

TGAGGAAA7A TAAGATTAAA AAAATCATCA TTGCAATTCC AACTATTGGT CAAGAGCGTT 2340

TGAAAGAAAT TAATAATATT TGCCATATGG ATGGCGTTGA GTTATTGAAA ATGCCAAATA 2400

TAGAAGACGT CATGTCTGGT GAGTTAGAAG TGAACCAACT TAAAAAAGTT GAAGTAGAAG 2460

ATTTACTAGG CAGAGATCCT GTTGAATTAG ATATGGATAT GATATCAAAT GAATTGACGA 2520

ATAAAACTAT TTTAGTTACG GGTGCAGGTG GTTCAATAGG ATCAGAAATT TGTAGACAAG 2580 

TTTGTAATTT CTATCCAGAA CGTATTATTC TACTTGGCCA TGGTGAAAAC AGTATTTATT 2640 

TAATCAATCG TGAATTGCGA AATCGCTTCG GwAAAAATGT TGATATCGTT CCTATTATAG 2700

CGGATGTGCA AAATAGAGCG CGTATGTTTG AAATTATGGA AACGTATAAA CCATACGCAG 2760

TTTATCATGC AGCAGCACAC AAGCACGTGC CGTTAATGGA AGACAACCCT GAAGAAGCAG 2820 


TACGTAATAA TATTTTAGGT ACGAAAAATA CTGCTGAAGC TGCTAAAAAT GCAGAGGTAA 2880 

AGAAATTCGT TATGATTTCT ACGGATAAAG CCGTTAATCC GCCTAATGTC ATGGGAGCTT 2940 

CAAAGCGAAT TGCAGAAATG ATTATTCAAA GTTTAAATGA TGAAACGCAT CGAACAAATT 3000

TTGTTGCAGT GAGATTTGGT AATGTACTTG GATCGAGAGG ATCTGTGATT CCACTTTTCA 3060

AAAGTCAAAT TGAAGAAGGT GGGCCAGTTA CTGTGACACA TCCTGAAATG ACACGTTACT 3120 

TTATGACAAT TCCTGAAGCT TCTAGACTAG TTTTGCAGGC AGGGGCATTA GCAGAAGGTG 3180

GCGAAGTATT TGTGCTAGAT ATGGGAGAAC CAGTGAAAAT TGTAGATTTG GCACGTAATT 3240 
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CCGGCGAAAA AATGTTTGAA GAGCTTATGA ATAAAGATGA GGTTCATCCT GAACAAGTAT 3360 

TTGAAAAAAT TTATCGTGGC AAAGTACAAC ATATGAAATG TAATGAAGTT GAAGCGATTA 3420 

TTCAAGACAT CGTCAATGAC TTTAGTAAAG AAAAAATTAT TAACTATGCC AATGGCAAAA 3480 

AGGGAGATAA TTATGTTCGA TGACAAAATT TTATTAATTA CTGGGGGCAC AGGATCATTC 3540 

GGTAATGCTG TTATGAAACA GTTTTTAGAT TCTAATATTA AAGAAATTCG TATTTTTTCA 3600 

CGCGATGAGA AAAAACAAGA TGACATTCGA AAAAAATATA ATAATTCAAA ATTAAAGTTC 3660 

TACATTGGTG ATGTGCGTGA TAGTCAAAGT GTAGAAACAG CAATGCGAGA TGTTGATTAC 3720 

GTATTCCATG CAGCAGCTTT AAAACAAGTG CCGTCATGTG AATTCTTTCC AGTTGAGGCA 3780

GTGAAGACAA ATATTATTGG TACAGAAAAT GTCTTACAAA GTGCTATTCA TCAAAATGTT 3840 

AAAAAAGTCA TATGTTTATC TACAGATAAG GCAGCGTATC CTATTAATGC TAGGGGTATT 3900 

TCAAAAGCAA TGATGGAAAA AGTATTCGTA GCCAAATCAA GAAATATTCG TAGTGAACAA 3960

ACGCTTATTT GTGGTACAAG ATACGGTAAT GTGATGGCTT CAAGAGGATC AGTAATACCT 4020 

TTGTTTATCG ACAAAATCAA AGCTGGAGAA CCTTTAACGA TTACAGATCC TGATATGACA 4080 

AGATTTTTAA TGAGCTTAGA AGATGCGGTA GAACTAGTTG TTCATGCATT TAAGCATGCA 4140

GAGACAGGAG ATATTATGGT TCAAAAAGCA CCAAGCTCAA CGGTAGGGGA TCTTGCGACC 4200 

GCATTATTAG AATTGTTTGA AGCTGATAAT GCAATTGAAA TCATTGGTAC GCGACATGGA 4260

GAGAAAAAAG CAGAAACATT GTTGACGAGA GAAGAATACG CACAATGTGA AGATATGGGT 4320 

GATTATTTTA GAGTGCCGGC AGACTCCAGA GATTTAAATT ATAGTAATTA TGTTGAAACC 4380 

GGTAACGAAA AGATTACGCA ATCTTATGAA TATAACTCCG ATAATACACA TATTTTAACG 4440 

GTGGAAGAGA TAAAAGAAAA ACTTTTAACA CTAGAATATG TTAGAAACGA ATTGAATGAT 4500

TATAAAGCTT CAATGAGATA GGAGAGATTG ACGTTGAATA TTGTAATTAC AGGAGCAAAA 4560 

GGTTTTGTAG GAAAAAACTT GAAAGCAGAT TTAACTTCAA CGACAGATCA TCATATTTTC 4620 

GAAGTACATC GACAAACTAA AGAGGAAGAA TTAGAGTCAG CATTGTTGAA AGCAGACTTT 4680 

GTCGTGCATT TAGCGGGTGT TAATCGACCT GAACATGACA AAGAATTCAG CTTAGGAAAC 4740 

GTGAGTTATT TAGATCATGT ACTTGATATA TTAACTAGAA ATACGAAAAA GCCAGCGATA 4800

TTATTATCGT CTTCAATACA AGCAACACAA GATAATCCTT ATGGTGAGAG TAAGTTGCAA 4860

GGGGAACAGC TATTAAGAGA GTATGCCGAA GAGTATGGCA ATACGGTTTA TATTTATCGC 4920 

TGGCCAAATT TATTCGGCAA GTGGTGTAAG CCGAATTATA ACTCAGTGAT AGCAACATTT 4980

TGTTACAAAA TTGCACGTAA CGAAGAGATT CAAGTTAATG ATCGGAATGT TGAACTAACG 5040
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ATTGAAAATG GTGTACCTAC AGTACCAAAC GTATTTAAAG TGACATTGGG AGAAATTGTA 
GATTTAT7AT ACAAGTTCAA ACAGTCACGT CTCGATCGAA CATTGCCGAA ATTAGATAAC 
TTGTTTGAAA AAGATTTGTA TAGTACGTAT TTAAGCTATC TACCTAGTAC aGACTTTAGT 
TAyCCCTTAC TTATGAATGT GGATGATAGG GGTTCTTTTA CAGAATTTAT AAAAACACCG 
GATCGTGGTC AAGTTTCTGT AAATATTTCT AAACCAGGTA TTACTAAAGG TAATCACTGG 
CATCATACTA AAAACGAAAA ATTTCTAGTC GTATCAGGTA AAGGGGTAAT TCGTTTTAGA 
CATGTTAATG ATGATGAAAT CATTGAATAT TATGTTTCTG GCGACAAATT AGAAGTTGTA 
GACATACCAG TAGGATACAC ACATAATATT GAAAATTTAG GCGACACAGA TATGGTAACT 
ATTATGTGGG TGAATGAAAT GTTTGATCCA AATCAGCCAG ATACGTATTT CTTGGAGGTA 
TAGCGCATGG aAAAACTGAA rTTAATGACA ATAGTTGGTA CAAGGCCTGA AATCATTCGT 
TTATCATCAA CGATTAAAGC ATGTGATCAA TATtTTAA 
5160
5220
5280
5340
5400
5460
5520
5580
5640
5700
5738

(2) INFORMATION FOR SEQ ID NO: 85:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9062 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85 : 
ATCATCAACA AGAATGATAT TTTTCCCATC TACTATATCT TTTACCGCAG ATAACTTCAC 
TCTCACACCT TGCTCACGTA ATTCTTGAGT TGGTTGAATA AATGTTCTTG CAACATATTG 
ATTTTTAACT AGTCCCATTT CATATGGCAA ACCTATTTCT TCAGCATAAC CACTCGCAGC 
TGATAGCGAT gAATTGGGTA CACCGATGAC CATATCAGCA TTTACAGGGC TTTCTTGGGC 
TAATTTTTTA CCAGAAGCTT TACGTACTGC ATGGACATTT TTACCAGCTA TTGTTGAGTC 
TGGTCTAGCA AAATAAATAT ATTCCATCGC AGAAATTGCA GTTGTCGTAT GATGTGTATA 
AGATTTAACT GTAATACCTT TATCGTTAAT CACGACATAT TCACCTGCAT GAATATCTTG 
AACAAATTCT GCACCTAACA CATCTATTGC ACATGTTTCA CTTGCAAGGA TGTATGTCCC 
ATCTTTCATT TTACCTACAA CAAGTGGTCT GATAGCATTT GGATCTACTG CGCCATATAA 
CGCATCTTTA GTTAAAATCG CAAATGTAAA ACCGCCTTTA ACTTTTCGCA AACTTTCTTT 
CAACGCTTCC TCAAAAGTAG GAGCTTTACT TCGACGTATC AAATGCATAA TGACTTCAGT 
ATCAGAAGAC GAATGGAAGA TAGCACCTTG TTTTTCTAAA TTCTGACGCA ATGATTTAGC 
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350 
420 
480 
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600 
660 
720 




CGGTTGAATA TTTTCAATAC CTTTATTACC TGAAGTAGCA TAACGGACGT GACCAATTGC 840

ATGTTGATAT CCTTTTAATC GTTCCATTTG ATCATCTTTA ATCGCTTCAG TTAGTAAGCC 900 

TAATCCTCGC TCGCCTTTTA ATTCATTTTG ATCAGAAACA ACTATACCTG cACCTTCTTG 960

ACCACGATGT TGCAAACTAT GAAGTCCCAT ATAtGTTAGT TGCGCTGCtT CaGGATGATT 1020 

CCAAATACCA AACACGCCAC ATTCTTCGTT TAATCCTGAG TAGTTAAACA TTGaGCAATT 1080

GCCCCtTCCC ATATTTGTTT AATATCTGAA ACATTTTCAC TAATCTCTGT aTATGGTGTT 1140

GTTACCTTGr aATTATCACT ATCTGTTAAA AGTCCAATTT CTATTGCATT ATCAATATTT 1200 

AAAGTTTTAC CTGATTTAAC AGAAACAACA TATCGGCCTT GCGTCTCACT AAACAATTGT 1260 


GCATTTGTTA TATCTATTGA AGATTTTAAT CCTAAACCGT AATGCGCACT TAGTTTAGCT 1320 

AAGGTAATCA GTAAGCCACC TTTACCAACT GTTTGAACAT GTGATAATAG TCCTTCACGA 1380 

ATAGCGGTCT TGATTGATTC ACCTTTTTCA ACTTCTGAAC TCAAATCTAA TGACTCAAAT 1440

TCATGATTAA CTTTGCCATA AATTAACTTT TCAAGTTGAC TACCACCAAA GTCGTCCTTA 1500 

GTATCACCGA TTAAATATAA TTTATCTCCA ACTTGAGGTT CAAAATCATT TAAATAATTT 1560

ACATTTTCAA TCAAACCTAC CATTCCAACA ACTGGTGTTG GGAAAATAGA AGTACCTTTC 1620

GTTTCGTTAT ATAAAGATAC ATTACCAGAA ACTACTGGTG TCTTAAGAAT GTCGCATGCT 1680

TCTGCCATAC CTTTCGTTGA ATCTATCAAC TGTTGATAGA TTTCTTTCTT TTCAGGAGAA 1740

CCATAATTTA AACAATCTGT CATTGCTAAT GGTGTTGCAC CCACGGCAAT TAAATTTCGA 1800

TAAGCTTCAG CTACTACCAT CTTTCCACCT TCATATGGAT TGTTATATAC ATAACGCGCT 1860

TCACCATCAA TTGTTGAAGC AATTGCCTTA TTTGTGCCTT CCACACGTAC TACCGATGCT 1920

TGAAGTCCTG GCTTAATTAT CGTATTGGCA CCAACTTGTT GGTCGTATTG ATCATATAAA 1980 

TAGTGTTTAG ATGCTATAGT CGGATGCTTA AGTAATTTAA AGAAAGTATC TTTAACATCG 2040 

ATGTGTGTAT AATCATTTTT AGAAGTATTA TAATCTTTTT CTTCTCCTTC TAAAATATAT 2100 


ACAGGTGCTT CATCAGCTAG TGGTTCAACT GGAATGTCAG CATAAACTTC GTCATCATAT 2160 

GTTAAAACAA AACGATTTGT ATCTGTAACT TCACCTATAA CAGCACTATC CAATTCGTGC 2220 

TTATCAAATA AATCTAAGAA TTTTTGTTCA GTACCTTTTT CAACAACTAG TAACATACGT 2280 


TCTTGAGTTT CTGAAAGCAT CATTTCATAA GGAGAAATAC CTGGCTCACG TGTTGGCACT 2340 

TGTTCTAATC TCAAATGTAA CCCACTACCA CCTTTTGCCG CCATTTCAGA CGATGAAGAT 2400 

GTTAAACCAG CAGCACCCAT ATCTTGAATA CCAACTAATT CATCAAATGT AATTGCTTCA 2460

AGTGTTGCTT CCATTAATTT TTTACCTACA AATGGATCAC CGATTTGTAC AGAAGGTCGT 2520 
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CGACCAGTTT TGAAACCAAC ATAAATGACC GAATTACCTA CACCTTTTGC TGTGCCTTTT 2 64 0 

TGAATCATGT CGTGATTGaT AACACCAACA CACATTGCA7 TAACAAGTGG ATTGCCATCA 2700 

TAACGTTCAT CAAATTCGAT TTCACCAGCA GTTGTTGGaA TACCAATGCA GTTACCATAA 2760 

CCTCCGATAC CCTTTACAAC ACCTTTAAGT AATCTTTGGT TTTGTTTATT ATCTAATTCT 2820

CCAAATCTAA GACTGTTTAA CAAATTAATA GGTCTAGCCC CAATAGAGAC AATGTCACGA 2880

ATGATTCCAC CAACGCCTGT AGCAGCCCCT TGATATGGTT CAATTGCTGA TGGATGATTG 2940 

TGAGACTCTA CTTTAAATAC TACGGCTTGA TTATCACCTA TATCGACTAC CCCTGCACCT 3000

TCACCAGGCC CCATAAGCAC ATGGTcACCT GACGTAGGAA ATTGCTTTAA AAACGGTTTA 3060

GAATGTTTAT AAGAGCAATG TTCACTCCAC ATAAGAGAAA AGATACCTGT TTCTGTAAAG 3120 

TTAGGTTGTC TGCCTAAAAT ATCGCAAACT TTTTCATATT CTTGATCaCT TAATCCCATA 3180 

TCTTGATATA CTTTTTCAAG TTTAATTTCT TCAACGCTTG GTTCGATAAA TTTAGACATG 3240

TTGTTCCCTC CAACTTTTTA CCATCGCTTC AAATAATTTC ACACCACTAT CAGTACCTAA 3300 

CAACGTTTCT AAAGCTCTTT CagGATGtGG CATCATGCCA CATACATTGC CTTTTTCGTT 3360 

AACAATTCCT GCAATATCAT CATATGAACC GTTCGGATTA TTCACATATT TCAGAATAAT 3420

TTGATTGTTA GCTTTTAATT GTTGATATAT TTCATCAGTA CAATAATAAT GACCTTCACC 3480

GTGAGCTACA GGATATATAA CTTTTTCACC TTGTTCATAA AGATTTGTAA ATGCCGTTTG 3540

ATTATTCACT ATTTCTAACT CTTCATTTCT ACTAATAAAT AAATGTGAAT CGTTATGCAA 3600

TAATGCACCA GGTAATAAGC CTATTTCAGT TAAAATTTGA AACCCATTAC AAACACCTAA 3660 

TACTGGCTTA CCTTCAGCTG CAAGACGTTT AACTTCCGAA ATAATCGGsG CTACACTAGC 3720

CATTGCCCCA GATCTTAAGT AATCCCCGAA TGAAAATCCA CCAGGAATAA GTACGCCATC 3780 

AAATSCACTT AGTGATGTTT CTCTATAATC TACATATTCC GCTTCAACAC CACTTTTAAT 3840 

AGCAGCATTA AACATGTCTC TATCACAATT CGAACCTGGA AAAACAAGAA CCGCAAATTT 3900 

CATTTTATGC ATTCTCCTTT TCATCATCTA ACACTTTATA GCTATATTCT TCAATCACTG 3960 

TAT7TGCAAA CAATTTTTCA CTTAGAGTTG TAATAATGTT GTGTACCTTT TCATCACTAA 4020 

CCTCATCCAC TGTCATATAT AATACTTTTC CTACACGAAT ATCATTCACT TGTGCATAAC 4080 

CTAAGTCATG TACAGCTCGA GTAAGCGTTT GTCCTTGCGT ATCTAATACT TGTGGTTGTA 4140 

ATGTGATATG TAGTTCAATT GTTTTCATTA TTTTAAATCC TCCAATTTGT TTAAAAATAT 4200 

TTGATATGTT TCAATCAGTG ATCCAGTGTT ATTTCTATAT ACATCTTTAT CAAAGTTTGC 4260

ATTGGTAGCT TTATCCCAAA TTCGACATGT ATCTGGAGAT ATTTCATCCG CTAACAAAAT 4320 
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ATCCATTAAT 


TGTTTCAACA CATTATTAAT CITTAATGCT TTGGATTTTA GTATTTCAAT 4440

ATCTTCATCT GATGCTATAT TGAGCAATTT AACATGGTCA TCCGTTATCA ACGGATCATT 4500

TAACGCATCA TTTTTATAGA AAAATTCTAC AAGTGGTTCT CTAAAAACTT CACCATTT7C 4560

AAAACCTAAA CGCTTTGTAA TAGATCCACT AGCAATATTA CGAACAACTA CTTCTAATGG 4620

AATTATTTTC ACAGGCTTAA CTAATTGTTC TGTTTCAGAT AATTGTTTAA TAAAGTGACT 4680

TTCTATTCCA TTTTCTTGTA AATATTTAAA TATAATAGAA GTAATTTGAT TAT1TAATCG 4740

CCCCTTACCT GCCATTGTGT CTTTCTTAGC CCCGTTTCCA GCAGTAACTT CATCTTTATA 4800

TTCAACTCTT AATTCATTTT CTTGATTTGT TGAGAAAATG CGcTTCGCTT TTCCTTCATA 4860

TAATAATGTC ATGCTTTAAT TACTCCCCTC AAATTTAGCG TACATATCTT GTTCAGTTTG 4920

GTTTACATCA TTCGTTAGTA CAGTCATATG CCCCATTTTT CTGCTATCTT TACGCTCAGA 4980

CTTACCATAA ATATGTAAGT GCCACTCTGG ATGTTCATTA AATTCATrJLT CCAATAAATC 5040

TAAATCTTTA CCTAGTAAGT TCATCATGAC TGCTGGCTTT AATAATTCAA TTGAATTTGG 5100

TAATGATTGT CCGGTAACTG CTAAAATATG AGTATCAAAT TGTGAATAAT CACATGCTTC 5160

AATTGAATAA TGTCCGGAAT TGTGAGGCCT TGGTGCTATC TCGTTCACAT ACAATTGGTT 5220

GTTACTATCT ATAAAAAATT CAACTGTAAA TGTTCCAATG AAATGAATCG ATTGGATAAT 5280

TTTATTAACT TGCTCTTTCG CCTCAGCTGT TTTATCTATT CTCGCTGGAA UAA I i VjT ill 5340

GAAAAGTATT TGATTTCTAT GCTCATTTTC TTGTAATGGG AAAAAAGTGA TTTGATTGTT 5400

GTTTCCTCTT GTAACAGTAA GAGATACTTC TTTCTTGATA TTCAAATATT TTTCAGCTAC 5460

GCATTCACTA GTTTCAATTA ATTTAAAACC TTCTTGTAAG TCTTTTTCGT TGTTAATTAA 5520

AACTTGACCT TTGCCATCGT AGCCACCAAA TCTAGTTTTT ACAATAAAAG GATATCCTAA 5580

TGTT^CAATT GCTTTGTCAA TATCTGTAGA TTCTTTTACT GAAATGAACG GGACAACTTT 5640

GGTACCAGCA CTTTTTAATG TTTCTTTTTC AGTTAAGCGA TCTTGTAATA ACTGTATAGC 5700


TTGGTAACCT TGCGGAATAT TGTACTTTTC ACATAATAGT TTTAATTGTT GGGCTGAAAT 5760

GTTTTCAAAT TCATAAGTAA TCACATCACA TTTTTGTCCT AATTGATTGA GTGCCTTTTC 5820


ATCGTCATAC TTGGCTT3TA TAAATTCGTG TGCAACGTAT CTACATGGAC AATCTTCAGA 5880

AGGATCCAAT ACAACCACTT TATAACCCAT TTTTTGAGCT GATTGTGCCA TCATCTTTCC 5940

AAGCTGACCA CCACCAATAA TGCCAATAGT CGCACCAAAC TTTAATTTAT TGAAGTTCAT 6000

TTTGCATGTC CTCCACTTTT TGAATTAACG AAGATTCATA CTGATTTAGT TTTTCAACTA 6060

AAGAAGGATT TTGAATACTT AACATTCTTG CTGCAAGTAT ACCTGCGTTT TTAGCACCTG 6120
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AAGAATCTAT ACCCTTTAAA CTTTTTGTTT CAATCGGCAC TCCAATAACT GGTAGCGTCG 624 0 

TTAATGATGC AACCATACCT GGTAAATGTG CCGCACCGCC AGCGCCTGCA ATGATAATGT 6300 

TTATACCTCT TTCTCTCGCT TCAGAAGCAA ATTGAACCAT CATTTTTGGC GTACGATGTG 6360 

CGGATACTAC TTGTTTTTCG TACGGAATTT CAAAATAATC CAACATGTTA CAACTCTCTT 6420 

GCATAATTTT CCAATCGGAA GAACTGCCCA TAATGACTGC TACTTTCACT TTGTACACCC 6480 

TTTCAAAAGT TTGAATTGTG AATTACTTTA GTTGTATATT ATAGATATAG CATAACAAGC 6540 

AATTTCTGCT TTTTCAATCA AAAATCGAAC TTTATTTTGA TTTTTTATTT GAATTTACGT 6600 

CTTTTGCTAT GTAAATTAGT TTTATAAACT AACAAAGTTA GGATATTGAC AATAGGAGGA 6660 

GAAGTTTTTA TGGTTGCTAA AATTTTAGAT GGTAAACAAA TTGCCAAAGA CTACAGACAG 6720 

GGGTTACAAG ATCAAGTTGA AGCGCTAAAA GAAAAGGGTT TTACACCTAA ATTATCCGTT 6780 

ATATTAGTTG GTAATGATGG CGCTAGTCAA AGTTATGTTA GATCAAAAAA GAAAGCAGCT 6840

GAAAAAATTG GTATGATTTc AGAAATCGTA CATTTGGAAG AAACAGCTAC TGAAGAAGAA 6900 

GTATTAAACG AACTAAATAG ACTAAATAAT GATGATTCTG TAAGTGGTAT TTTGGTACAA 6960

GTACCATTAC CAAAACAAGT TAGCGAACAG AAAATATTAG AAGCAATCAA TCCTGAAAAA 7020

GATGTGGACG GTTTTCATCC AATAAATATA GGGAAATTAT ATATCGATGA ACAAACTTTT 7080

GTACCTTGCA CACCGCTCGG CATCATGGAA ATATTAAAAC ATGCTGATAT TGATTTAGAA 7140 

GGTAAAAATG CAGTTGTAAT TGGACGAAGT CATATTGTCG GACAACCAGT TTCTAAGTTA 7200

CTACTTCAAA AAAATGCATC AGTAACAATC TTACATTCTC GTTCAAAAGA TATGGCATCA 7260 

TATTTAAAAG ATGCTGATGT CATTGTCAGT GCAGTTGGTA AGCCTGGTTT AGTAACAAAA 7320 

GATGTGGTCA AAGAAGGAGC AGTAATTATC GATGTTGGCA ATACGCCAGA TGAAAATGGC 7380

AAAXTAAAAG GTGACGTTGA TTATGATGCG GTTAAAGAAA TTGCTGGAGC TATTACACCA 7440 

GTTCCTGGTG GCGTTGGTCC ATTAACAATT ACTATGGTAT TAAATAATAC TTTGCTTGCA 7500 

GAAAAAATGC GTCGAGGTAT TGATTCGTAA AGAGCCTGAG ACATAAATCA ATGTTCTATG 7560 

CTCTACAAAG TTATAATGGC AGTAGTTGAC TGAACGAAAA TTCGCTTGTA ACAAGCTTTT 7620 

TTCAATTCTA GTCAACCTTG CCGGGGTGGG ACGACGAAAT AAATTTTACG AAAATATCAT 7680 

TTCTGTCCCA CTCCCTAATA ACTGAGTTTT AATGAAGTCT TTTAACCCAC ATTAAATATT 7740 

ATTTTGCAAT TGCAATGAAT AACAAGAAAA ATCTGGGACA TTAATCGATC AAATGCTCCC 7800

TTCAAAGTAG ACATTGAATA AATGAAGGCT TTGAAGGGAG CATTTCACTT TGTACTTGGC 7860

TCAACAATTT TATATAGACA GTAGTTAATT GAATGAAAAT AAGCTTGTAA CAAGTTTTCA 7920 
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GTTGGGGATG 


GGCCCCAACA CAGAAGCTGT GACTATGATA AAGTACTACT ACATAGTTAA 8040

TCATTAGTGG TTCTTTATCA TTTTCGCCTC CCTTTTCTTA TTGTTTTGAT ACACAAAAAT 8100

TTAAGTTCAA ACTGTCGAAT AAAGTTATAT TTGATTTCAA ATTATCCCTA AATTATTAAT 8160

TkTACAATTG TGGCAGATTT TCAAAATAAT AATTATTTCC TCATTATTTA TAAATTTATA 8220

TTTAAATTTC ATTCTTTATA GGGTAAGATT AGGACTATAG TATGATGTGT ArATAATATA 8280

AATTAAGGTA TAGTAAAGCT AACTCAGAAA TGACTTATCA TTCGGAGGTT ACATTATGAA 8340

TAAACTATTA CAGTCATTAT CAGCCCTCGG TGTTTCTGCT ACACTAGTAA CACCAAATTT 8400

AAATGCAGAT GCAACGACGA ATACTACACC ACAAATTAAA GGCGCTAATG ATATCGTTAT 8460

TAAGAAAGGT CAAGATTATA ACCTTCTAAA CGGCATAAGT GCATTTGATA AAGAAGATGG 8520

AGATTTAACC GATAAAATTA AAGTCGATGG CCAAATTGAT ACATCTAAAT CTGGTAAATA 8580

TCAAATTAAA TATCATGTCA CTGATTCAGA TGGTGCAATT AAAATTTCCA CTAGGTATAT 8640

TGAGGTTAAA TAGCCCTCAT CACTATACTG CAAATAAAAT GGTAGCAAAC GAACATGTTT 8700

TGCTACCATT TTATTTGTTA TTCTAACTTC ATCTGCAACT TTAACCCAAA TATTGTATTT 8760


25 


irr v, -i >j i a l a 




APPTATCAAA 


TTATTAAAAC 


TTAACTGCTC 


TTTTTAAAAA 


8820 

m+ v 




AATGTTTTGA TTTTGAACAA ACAAATTTCC ACTTTTCATT GTTTAACGAT AAATTACTTT 8880

TGGCAAATTC CTTATTAAAA TGTTTGCGCT TCCTTTCAA7 CAACTAGCCA TCATTTTCAA 8940

TTTATTAGAC AATTTCAAAC TTTTTTTATT TTCATTCAAT TAACCTTTAA TTGAAAGCTA 9000

TTCTCAACTT TCCTTTTAAA TATGAAGCAA TTTTTTCAAA AACGCTATTA GTCACAAAAT 9060



QX 9062 

(2) INFORMATION FOR SEQ ID NO: 86:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2738 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86:

AAATATTTTT TCAAAACTAT GTGAAAATGG aCCATGTCCA aATCATGTAA TAATGCAGyA 60 

CATAATGCCA ACGGTCTmTC TTTATTGTCC CATGCATCAT GACCAATAAA TGACTCATCA 120 

ATTAATCGTC TAACTATTTC ATACACACCT AAAGAATGTC CAAAGCGACT ATGTTCTGCT 180

GTGTGAAAAG ATAGGTACAG TGTTCCTAGT TGTCTAATTC GACGTAACCT TTGGAATTCC 240 


TCTTTAAAAA CTTTTTCTTC TACTAATTTT AAATCTACAT ATGCGTTAGT CATTATTCCC 360 

CTCCTTTTCG TTTAATATAA TATTTAATTT ACTTAAAATG CTTTGTACAT AAGTGCTAAG 420

TCTAACTTTT CGCCATACAT TTCTGGCTCA TAAGAGCGTA AGATTGTAAA ACCTTGCTCT 480 

TTATAGTAAG CTACTGCTTC TTCATTTTTA TTATCTACTT CTAAGTAAAC ACCTTCAAAT 540

TTATCTTCAA AACGTGATAA TCCTTCATTT AACAATGCTG TACCATAACC TGTATGTTGC 600

GATTCTGGTT TAACATAATG AGCTGATAAA TATAATTCTT CACCGTAAAT AAAGTTAGCA 660

AAGCCAACGA TGTCATTACC TTCTTCAACG ACTAAGAATA ATTGTTCTTG AAGTCTTTTC 720

TTTAAATGAT GTTCATTATA TGAAGCTtCT AACAAGTGAT TAACTGTTGT CGCAGCGTAT 780

ATATTTAAGT ATGTATTAAA CCAAGCTTTA GTTGCGACAT CTCTAATTTG AACAACATCT 840

TTTTCAGTTG CTTGTCTTAC CTTGAACATG ACTTTCTCCC CTTATTAACA AGTTTTAATA 900 

ACGGCATTAT ACCACAACTT GCTCAATACT TAATAAACAA TGATTGTCTA TTCAATTTAT 960

ATATtTATAT TTTCCGTTAA AATTAAAAAT AAAAAATAAC GAAGCAAAAA AtCACTTCGT 1020 

TTAGTATGAG GTATGTCTTA TTGCAATATA CTATTCCACT CAGTTGCACG TGCTAAGGCA 1080

TAGTTGTCTT TCATGATGTC ACCAGGCTTT TCAGCAGTTC CAATAATATA ACCATTTAAA 1140

GTGGCACCTA rAAAGTCTAA ACTATATTTC ATTTGCGTAA TTGCTGGTTC GCTTTTATTT 1200 

TTGGACAATC TCCACCAACT AAAATAACTC TAAAATCCTT TTCGGCCATT TGTGCCTTAA 1260

AATTAGGATA TCGTTTATCT TGTAATGTTT CTGACCAATG TTCGATAAAT GCTTTCAATG 1320

GTGCTGAAAT GCTATACCAA TACACTGGTG ATGCAAAAAT AATTGTATCA CTAGCCAATA 1380 

TTTTATCTAG AATCGGCAAA TAGTCATCGT CATATGAAGT AATAGTCTCT GCTGTATGTC 1440

TCACGTCACG TATCGGTTTA AACTGATGTT GTGTCACGTC AATCCATTGA TACTCTAAAT 1500 

CTTGCAAAGC GAATTTTGTT AATTGTGCAG TATTACCGTT TGGTCTACTC CCACCAAACA 1560 

AAACAGTAAT CATTTTAGCC TAACCTCACT TTTGATTAAT AAATATCTGT GTTTTTCGTT 1620 

ACCTAATTAT ACTATCATAA GCTTTGCCTA CCGAATAGTA AAACGCTTAC AACTTTTATA 1680

TAAATTTGAC GAAATTTCGT CATGCCTTAT ATAACGTCGT TTGTGATACG GGGCTAATTC 1740

ATGATGAAAT TAGATACATA TATCACCATT AAATACAATT CATTTAGTCT TCAATCGGAA 1800 

ACAGTTCATC GATATATTGA ATCTCATCAT CTGATAAAAC GATATCTGCA GCTTTAATAT 1860

TITCAACGAC TTGTTCTGCA CGTTTTGCAC CAGGAATAAT CACATCGATA GCTGGTCTCG 1920

TTAAATAAAA TGCTAATACA ATGTTCGCAA TTGAAGTTTG ATGTGCTGCA GCTATGCTTT 1980 

CCAAAGCTTT TACGCGACGC ACATTTTCTT CAAATACACC TGGTTTAAAA TCACGACGTG 2040 
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GCTAATGGGA AATATGGAAT AAATGTGATT TGGTGATCAA CACAATATTG TAATACTGCC 2160 

TCATTTTCGC GATGCAATAA ATTATATTCT AACTGTACAA CATCAACGTA ACCATCTTTA 2220 

TTTGCTTCTT TAAGTTGATC TAATGTGAAA TTTGATACAC CAATTGCTTT AATCTTCCCT 2280

TGTTCCTTAA GCTCTTGTAA TGCTGCAACT GCTTGATCTT TCGGAGTGTT GTTATCCGGA 2340

AAATGAATAT AATATAAATC GATATAATCA GTTTGTAGAC GTTTCAAACT ATTCTCAACT 2400 

TGTTGTTTTA AATATTCCGG TTGATTGTTC TGATGTACTT CTTGATTTTC ATCAAATTCA 2460

TGAGACCCTT TCGTAGCAAT TTTAATTTGC TCTCGCGGAT ATTCTTTAAC AACTTCTCCA 2520 

ACCAATTCTT CTGATCGTTC TGGCCCATAA ATATATGCCG TATCTAATAA ATTAATACCA 2580 


TGATTAATGG CTTGACGAAC AACATCTTTT CCTTGTTCTT CATCTAAGTT CGGATATAAA 2640 

TTATGCCCAa CCTAtGCGTT CGTCCCAAGT GCGATTGGAA ACACTTCAAC ATCAGATTTA 2700 

CCTAAGTTTA CAAATTGCTn CATTAGACCC AGCnCCTT 2738

(2) INFORMATION FOR SEQ ID NO: 87: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9425 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87:

GATTAGATGA TATTTAACGA AAATTAaGrT GmAATACTtG AATGTArGAa GTCTGATGTC 60 

GAAAATAGCT ATTAAAATAG AGTAGACGTA ATGtAAATGA AAGCACCTAA AATAGAAAAA 120 


TTTCAAAAAT AGCGTAATTA TTATAATAAA TAGACTGCCA ATAAAATGCA ATTTTTCACT 180 

TATAACATTC TTCAAAAAAT AATAGCAAAA TTATGTAAAA AATATCTTGT CATGGCAAGA 240 

TTGGCTGTGC TATAATCTAT CTTGTGCTTA AGAACGGCTC CTTGGTCAAG CGGTTAAGAC 300 


ACCGCCCTTT CACGGCGGTA ACACGGGTTC GAGTCCCGTA GGAGTCACCA TTTTTTAGGT 360 

CTCGTAGTGT AGCGGTTAAC ACGCCTGCCT GTCACGCAGG AGATCGCGGG TTCGATTCCC 420 

GTCGAGACCG TACAAATGCC TATCCAAGAG GATAGGCATT TTTTTGCGTT TAATATTATA 480 


TTAATAAAAG ATATATGGAC GAATGATAAT CATATTGATT TATCTGTTCG TCCATTTTCT 540 

TTAAAATGTA TGAACCTCAA GTAACTTAGT GGTTGGATAT GAAAGATAAA CGTAGACAAT 600 

AAAATCTTTA TTAGACGTAC AAACATATGC TACTGTCAAC ATATTTCTTC GTTGTGATAT 660

GCCACCAGTC CTCCATAACA TCAATTGTTA AAGTAACGAA TAACGAATAA TGATATTTAT 720
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GACCTCATCA 


TTGTGTTAAA TATCATTGTC ACAATCCGCC GTGAGAAACT AATAAAAAAT 840

AGTAATATAT AAGTTTATAT TGGAAAATAG AATTAATAGC TTATAAATGG TAAATTATAT 900

AATAGGTTAC TATACGTTAT AAGACGGAAA ATGCGCACAA TAACAAAAAT AGTAAGCGAC 960

ATCCTGTGAT TTTTTACACA AACATAAACG ATAAAGAACA AAAAATGATA AAATAATATT 1020

AATGATTTAA GAAAAGAGGT TTATGCAAAT GGCTAGAAAA GTTGTTGTAG TTGATGATGA 1080

AAAACCGATT GCTGATATTT TAGAATTTAA CTTAAAAAAA GAAGGATACG ATGTGTACTG 1140

TGCATACGAT GGTAATGATG CAGTCGACTT AATTTATGAA GAAGAACCAG ACATCGTATT 1200

ACTAGATATC ATGTTACCTG GTCGTGATGG TATGGAAGTA TGTCGTGAAG TGCGCAAAAA 1260

ATACGAAATG CCAATAATAA TGCTTACTGC TAAAGATTCA GAAATTGATA AAGTGCTTGG 1320

TTTAGAACTA GGTGCAGATG ACTATGTAAC GAAACCGTTT AGTACGCGTG AATTAATCGC 1380

ACGTGTGAAA GCGAACTTAC GTCGTCATTA CTCACAACCA GCACAAGACA CTGGAAATGT 1440

AACGAATGAA ATCACAATTA AAGATATTGT GATTTATCCA GACGCATATT CTATTAAAAA 1500

ACGTGGCGAA GATATTGAAT TAACACATCG TGAATTTGAA TTGTTCCATT ATTl'ATCAAA 1560

ACATATGGGA CAAGTAATGA CACGTGAACA TTTATTACAA ACAGTATGGG GCTATGATTA 1620

CTTTGGCGAT GTACGTACGG TCGATGTAAC GATTCGTCGT TTACGTGAAA AGATTGAAGA 1680

TGATCCGTCA CATCCTGAAT ATATTGTGAC GCGTAGAGGC GTTGGATATT TCCTCCAACA 1740

ACATGAGTAG AGGTCGAAAC GAATGAAGTG GCTAAAACAA CTACAATCCC TTCATACTAA 1800

ATTTGTAATT GTTTATGTAT TACTGATTAT CATTGGTATG CAAATTATCG GGTTATATTT 1860

TACAAATAAC CTTGAAAAAG AGCTGCTTGA TAATTTTAAG AAGAATATTA CGCAGTACGC 1920

GAAACAATTA GAAATTAGTA TTGAAAAAGT ATATGACGAA AAGGGCTCCG TAAATGCACA 1980

AAAAGATATT CAAAATTTAT TAAGTGAGTA TGCCAACCGT CAAGAAATTG GAGAAATTCG 2040

TTTTATAGAT aaagaccaaa TTATTATTGC GACGACGAAG CAGTCTAACC GTAGTCTAAT 2100

CAATCAAAAA GCGAATGATA GTTCTGTCCA AAAAGCACTA TCACTAGGAC AATCAAACGA 2160

TCATTTAATT TTAAAAGATT ATGGCGGTGG TAAGGACCGT GTCTGGGTAT ATAATATCCC 2220

AGTTAAAGTC GATAAAAAGG TAATTGGTAA TATTTATATC GAATCAAAAA TTAATGACGT 2280

TTATAACCAA TTAAATAATA TAAATCAAAT ATTCATTGTT GGTACAGCTA TTTCATTATT 2340

AATgCACAGT CATCCTAGGA TTCTTTATAG CGCGAACGAT TACCAAACCA ATCACCGATA 2400

TGCGTAACCA GACGGTCGAA ATGTCCaGAG GTAACTATAC GCAACGTGTG AAGATTTATG 2460

GTAATGATGA AATTGGCGAA TTAGCTTTAG CATTTAATAA CTTGTCTAAA CGTGTACAAG 2520
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15 



GTGATGGTAT TATTGCAACA GACCGCCGTG GACGTATTCG TATCGTCAAT GATATGGCAC 2640
TCAAGATGCT TGGTATGGCG AAAGAAGACA TCATCGGATA TTACATGTTA AGTGTATTAA 2700
GTCTTGAAGA TGAATTTAAA CTGGAAGAAA TTCAAGAGAA TAATGATAGT TTCTTATTAG 2760
ATTTAAATGA AGAAGAAGGT CTAATCGCAC GTGTTAACTT TAGTACGATT GTGCAGGAAA 2820
CAGGATTTGT AACTGGTTAT ATCGCTGTGT TACATGACGT AACTGAACAA CAACAAGTTG 2880
AACGTGAGCG TCGTGAATTT GTTGCCAATG TATCACATGA GTTACGTACA CCTTTAACTT 2940
CTATGAATAG TTACATTGAA GCACTTGAAG AAGGTGCATG GAAAGATGAG GAACTTGCGC 3000
CACAATTTTT ATCTGTTACC CGTGAAGAAA CAGAACGAAT GATTCGACTG GTCAATGACT 3060
TGCTACAGTT ATCTAAAATG GATAATGAGT CTGATCAAAT CAACAAAGAA ATTATCGACT 3120
TTAACATGTT CATTAATAAA ATTATTAATC GACATGAAAT GTCTGCGAAA GATACAACAT 3180
TTATTCGAGA TATTCCGAAA AAGACGATTT TCACAGAATT TGATCCTGAT AAAATGACGC 3240
AAGTATTTGA TAATGTCATT ACAAATGCGA TGAAATATTC TAGAGGCGAT AAACGTGTCG 3300
AGTTCCACGT GAAACAAAAT CCACTTTATA ATCGAATGAC GATTCGTATT AAAGATAATG 3360
GCATTGGTAT TCCTATCAAT AAAGTCGATA AGATATTCGA CCGATTCTAT CGTGTAGATA 3420
AGGCACGTAC GCGTAAAATG GGTGGTACTG GATTAGGACT AGCCATTTCG AAAGAGATTG 3480
TGGAAGCGCA CAATGGTCGT ATTTGGGCAA ACAGTGTAGA AGGTCAAGGT ACATCTATCT 3540
TTATCACACT TCCATGTGAA GTCATTGAAG ACGGTGATTG GGATGAATAA TAAGGAGCAT 3600
ATTAAA7CTG TCATTTTAGC ACTACTCGTC TTGATGAGTG TCGTATTGAC ATATATGGTA 3660
TGGAACTTTT CTCCTGATAT TGCAAATGTC GACAATACAG ATAGTAAGAA GAGTGAAACG 3720
rAACCTTTAA CGACACCTAT GACAGCCAAA ATGGATACAA CTATTACGCC ATTTCAGATT 3780
ATTCATTCGA AAAATGATCA TCCAGAAGGA ACGATTGCGA CGGTATCTAA TGTGAATAAA 3840
CTGACGAAAC CTTTGAAAAA TAAAGAAGTG AAGTCCGTGG AACATGTTCG TCGTGATCAT 3900
AACTTGATGA TTCCTGATTT GAACAGTGAT TTTATATTAT TCGATTTTAC GTATGATTTA 3960
CCGTTATCAA CATATCTTGG TCAAGTACTG AACATGAATG CGAAAGTACC AAATCATTTC 4020
AATTTCAATC GTTTGGTCAT AGATCATGAT GCTGATGATA ATATCGTGCT TTATGCTATA 4080
AGCAAAGATC GCCACGATTA CGTAAAATTA ACAACTACAA CGAAAAATGA TCATTTTTTA 4140
GATGCATTAG CAGCAGTGAA AAAAGATATG CAACCATACA CAGATATCAT CACAAACAAA 4200
GATACAATTG ATCGTACGAC GCATGTTTTT GCACCAA3TA AACCTGAAAA GTTAAAAACA 4260
TATCGCATGG TATTTAACAC GATTAGTGTT GAGAAAATGA ATGCTATACT ATTTGACGAT 4320
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GCAAACTATA ACGATAAAAA TGAAAAATAT CATTATAAAA ACCTGTCCGA AGATGAAGCG 444 0 

AGTTCCAGCA AAATGGAAGA AACGATTCCA GGAACCTTTG ATTTTATTAA TGGTCATGGT 4500 

GGTTTCTTAA ACGAAGACTT TAGATTGTTT AGTACGAATA ATCAGTCAGG CGAGTTAACA 4560 

TATCaACGTT TCCtTAATGG TTATCCAACG TTTAATAAAG AAGGTTCTAA TCAAATTCAA 4620

GTCACTTGGG GTGAAAAAGG CGTCTTTGAC TATCGTCGTT CGTTATTACG CACCGACGTT 4680 

GTTTTAAATA GTGAGGATAA TAAATCGTTG CCGAAATTAG AGTCTGTACG TTCAAGCTTA 4740

GCGAACAATA GTGATATTAA TTTTGAAAAA GTAACAAACA TCGCTATCGG TTACGAAATG 4800

CAGGATAATT CAGATCATAA TCACATTGAA GTGCAGATTA ACAGTGAACT CGTACCGCGT 4860 

TGGTATGTAG AATATGATGG CGAATGGTAT GTTTATAACG ATGGGaGGCT TGaATAAATG 4920 

AACTGGaAAC TGACAAAGAC ACTTTTCATT TTCGTGTTTA TTCTTGTCAA CATCGTGTTA 4980
GTATCGATTT ATGTTAATAA AGTCAATCGC TCACACATTA ATGAAGTCGA GAGTAACAAT 5040

GAAGTTAATT TTCAGCAAGA AGAAATTAAA GTACCGACTA GTATATTGAA TAAATCAGTT 5100 

AAAGGTATAA AATTAGAGCA AATTACAGGG CGATCAAAAG ACTTTAGTTC TAAAGCTAAA 5160

GGCGATTCGG ATTTGACCAC ATCAGATGGT GGAAAATTAT TGAATGCGAA CATTAGTCAA 5220

TCGGTAAAGG TCAGTGACAA TAACTTAAAA GATTTGAAAG ATTATGTTAA CAAGCGCGTA 5280

TTTAAAGGTG CTGAATATCA ATTAAGCGAG ATTAGTTCAG ATTCTGTAAA ATATGAACAA 5340 

ACGTATGATG ATTTTCCGAT TTTAAATAAC AGTAAAGCGA TGTTAAACTT TAATATAGAA 5400

GATAACAAAG CGACTAGTTA TAAACAATCA ATGATGGATG ACATTAAGCC CACAGATGGT 5460 

GCAGATAAGA AGCATCAAGT GATTGGTGTG AGAAAAGCAA TCGAGGCATT ATATTATAAT 5520 

CGTTACTTGA AAAAAGGTGA TGAAGTCAT7 AATGCTAGAC TCGGTTACTA CTCAGTCGTG 5580

AATGAAACGA ATGTTCAATT GTTACAACCA AACTGGGAAA TTAAAGTGAA GCATGACGGT 5640 

AAGGATAAAA CGAATACTTA CTATGTCGAA GCGACAAATA ATAACCCTAA AATTATTAAT 5700 

CATTAATATG AATCGTAATA AGCTAGCATT GCAAGCTCAT CATATGTGAG AAGCGGTGCT 5760 

AGCTTTTTTG CTGGTACGGT TTATTATGGC TGATGTTTTT GCGTCTCCAA CGTGCGCATT 5820 

TATTCATATT TTAAGTAGAA CCGCATTGTA AAATTAGTGT AACTGTTATT TTAAAAACTT 5880

TAGTATTTGT CTAATCATTG TTATAATAAT TAAGAAATTC ATTGCACGTG ATTATCAAAA 5940 

TTTAAATATA AGAAACCGGT CGATGAACTA AAGTTACATA ATAGGAAAGG TATACAAAAC 6000 

AGCTAATATA CTGATAGTTT CTGTAGGGAA AATCGTATAT TTGCACTGAT GTATATTGCA 6060

GTCATATAGA GAGATTGACT GTTTAAAGAG AAAGGATGAG CCGCTTGATA CGCATGAGTG 6120 
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TAGTTGATGT TGGTTTGACT GGAAAGAAAA TGGAAGAATT GTTTAGTCAA ATTGACCGTA 624 0 

ATATTCAAGA TTTAAATGGT ATTTTAGTAA CCCATGAACA TATTGATCAT ATTAAAGGAT 6300

TAGGTGTTTT GGCGCGTAAA TATCAATTGC CAATTTATGC GAATGAAAAA ACTTGGCAGG 6360 

CAATTGAAAA GAAAGATAGT CGCATCCCTA TGGATCAGAA ATTCATTTTT AATCCTTATG 6420 

AAACAAAATC TATTGCAGGT TTCGATGTTG AATCGTTTAA CGTGTCACAT GATGCAATAG 6480 

ATCCGCAATT TTATATTTTC CATAATAACT ATAAGAAGTT TACGATTTTA ACGGATACGG 6540 

GTTACGTGTC TGATCGTATG AAAGGTATGA TACGTGGCAG CGATGCGTTT ATTTTTGAGA 6600 

GTAATCATGA CGTCGATATG TTGAGAATGT GTCGTTATCC ATGGAAGACG AAACAACGTA 6660 

TTTTAGGCGA TATGGGTCAT GTATCTAATG AGGATGCGGC TCATGCAATG ACAGACGTGA 6720 

TTACAGGTAA CACGAAACGT ATTTACCTAT CGCATTTATC ACAAGACAAT AACATGAAAG 6780 

ATTTGGCGCG TATGAGTGTT GGCCAAGTAT TGAACGAACA CGATATTGAT ACGGAAAAAG 6840 

AAGTATTGCT ATGTGATACG GATAAAGCTA TTCCAACGCC AATATATACA ATATAAATGA 6900 

GAGTCATCCG ATAAAGTTCC GCATTGCTGT GAGACGACTT TATCGGGTGC TTTTTTATGT 6960 

25 TGTTGGTGGG AAATGGCTGT TGTTGAGTTG AATCGGCTTG ATTGAAATGT GTAAAATAAT 7020 

TCGATATTAA ATGTAATTTA TAAATAATTT ACATAAAATC AATCATTTTA ATATAAGGAT 7080 

TATGATAATA TATTGGTGTA TGACAGTTAA TGGAGGGAAC GAAATGAAAG CTTTATTACT 7140 

TAAAACAAGT GTATGGCTCG TTTTGCTTTT TAGTGTAATG GGATTATGGC AAGTCTCGAA 7200

CGCGGCTGAG CAGCATACAC CAATGAAAGC ACATGCAGTA ACAACGATAG ACAAAGCAAC 7260 

AACAGATAAG CAACAAGTAC CGCCAACAAA GGAAGCGGCT CATCATTCTG GCAAAGAAGC 7320

GGCAACCAAC GTATCAGCAT CAGCGCAGGG AACAGCTGAT GATACAAACA GCAAAGTAAC 7380

ATCGAACGCA CCATCTAACA AACCATCTAC AGTAGTTTCA ACAAAAGTAA ACGAAACACG 7440

CGACGTAGAT ACACAACAAG CCTCAACACA AAAACCAAC? CACACAGCAA CGTTCAAATT 7500 

ATCAAATGCT AAAACAGCAT CACTTTCACC ACGAATGTTT GCTGCTAATG CACCACAAAC 7560 

AACAACACAT AAAATATTAC ATACAAATGA TATCCATGGC CGACTAGCCG AAGAAAAAGG 7620 

GCGTGTCATC GGTATGGCTA AATTAAAAAC AGTAAAAGAA CAAGAAAAGC CTGATTTAAT 7680 

GTTAGACGCA GGAGACGCCT TCCAAGGTTT ACCACTTTCA AACCAGTCTA AAGGTGAAGA 7740 

AATGGCTAAA GCAATGAATG CAGTAGGTTA TGATGCTATG GCAGTCGGTA ACCATGAATT 7800 

TGACTTTGGA TACGATCAGT TGAAAAAGTT AGAGGGTATG TTAGACTTCC CGATGCTAAG 7860

TAcTAACGTT TATAAAGATG GAAAACGCGC GTTTAAGCCT TCAACGATTG TAACAAAAAA 7920 
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TGAAGGCATT AAAGGCGTTG AATTTAGAGA TCCATTACAA AGTGTGACAG CGGAAATGAT 
GCGTATTTAT AAAGACGTAG ATACATTTGT TGTTATATCA CATTTAGGAA TTGATCCTTC 
AACACAAGAA ACATGGCGTG GTGATTACTT AGTGAAACAA TTAAGTCAAA ATCCACAATT 
GAAGAAACGT ATTACAGTTA TTGATGGTCA TTCACATACA GTACTTCAAA ATGGTCAAAT 
TTATAACAAT GATGCATTGG CACAAACAGG TACAGCACT7 GCGAATATCG GTAAGATTAC 
ATTTAATTAT CGCAATGGAG AGGTATCGAA TATTAAACCG TCATTGATTA ATGTTAAAGA 
CGTTGAAAAT GTAACACCGA ACAAAGCATT AGCTGAACAA ATTAATCAAG CTGATCAAAC 
ATTTAGAGCA CAAACTGCAG AGGTAATTAT TCCAAACAAT ACCATTGATT TCAAAGGAGA 
AAGAGATGAC GTTAGAACGC GTGAAACAAA TTTAGGAAAC GCGATTGCAG ATGCTATGGA 
AGCGTATGGC GTTAAGAATT TCTCTAAAAA GACTGACTTT GCCGTGACAA ATGGTGGAGG 
TATTCGTGCC TCTATCGCAA AAGGTAAGGT GACACGCTAT GATTTAATCT CAGTATTACC 
ATTTGGAAAT ACGATTGCGC AAATTGATGT AAAAGGTTCA GACGTCTGGA CGGCTTTCGA 
ACATAGTTTA GGCGCACCAA CAACACAAAA GGACGGTAAG ACAGTGTTAA CAGCGAATGG 
CGGTTTACTA CATATCTCTG ATTCAATCCG TGTTTACTAT GATATAAATA AACCGTCTGG 
CAAACGAATT AATGCTATTC AAATTTTAAA TAAAGAGACA GGTAAGTTTG AAAATATTGA 
TTTAAAACGT GTATATCACG TAACGATGAA TGACTTCACA GCATCAGGTG GCGACGGATA 
TAGTATGTTC GGTGGTCCTA GAGAAGAAGG TATTTCATTA GATCAAGTAC TAGCAAGTTA 
TTTAAAAACA GCTAACTTAG CTAAGTATGA TACGACAGAA CCACAACGTA TGTTATTAGG 
TAAACCAGCA GTAAGTGAAC AACCAGCTAA AGGACAACAA GGTAGCAAAG GTAGTAAGTC 
TGGTAAAGAT ACACAACCAA TTGGTGACGA CAAAGTGATG GATCCAGCGA AAAAACCAGC 
TCCAGGTAAA GTTGTATTGT TgtAGCGCAT AGAGGAACTG TTAGTAGCGG TACAGAAGGT 
TCTGGTCGCA CAATAGAAGG AGCTACTGTA TCAAGCAAGA GTGGGAAACA ATTGGCTAGA 
ATGTCAGTGC CTAAAGGTAG CGCGCATGAG AAACAGTTAT TTCATAATCA ACAGTCATTG 
ACGTAGCTAA GTAATGATAA ATAATCATAA ATAAAATTAC AGATATTGAC AAAAAA7AGT 
AAATA 

(2) INFORMATION FOR SEQ ID NO: 88:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3886 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
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AAGTTGAACA TAGTGACATT ATGACAGCAA GTCCAGAGAT GGCTG ACTTG TTTATTTGTG 18 00 

GTAGAGATTT AGCTGAAAAT GCCGAACGTC TAGGGGATGT CTTAGTTCTT GATAATATTT 1860

TAGATAAAGC TGAATTACAA CAAAAGCTCT CAGAAAAATT ACAACAACTT AACATGATTT 1920

AAAGGAGGTA CGACCTATGC AAGCAATCCT TAATTTTATA GTCGATATTT TAAGTCAACC 1980 

AGCCATTCTT GTTGCACTGA TTGCCTTTAT AGGTTTAATC GTTCAGAAAA AACCTGCCGC 2040 

AACGATCACT TCAGGAACCA TTAAAACGAT ATTAGGCTTC TTAATTTTAA GTGCAGGTGC 2100 

TGATGTCGTC GTTCGATCTC TTGAACCATT CGGCAAAATA TTCCAACACG CATTTGGTGT 2160 

GCAAGGTATC GTACCTAACA ACGAAGCTAT CGTCTCACTA GCCTTAAAAG ATTTTGGAAC 2220 

AACAGCTGCA CTCATCATGG TCTGTGGCAT GATTGTTAAT ATTTTAATTG CCCGCTTCAC 2280 

TAATTTAAAA TATATCTTTT TAACAGGTCA TCATACATTT TACATGGCTG CGTTTTTAGC 2340

AATCATTTTA ACAGTCAGTC ATATTAAAGG CTGGCTAACG ATTGTTATCG GCGCACTCGT 2400

ATTAGGATTA ATCATGGCAG TATTACCTGC ATTACTCCAA CCTACGATGC GAAAAATTAC 2460 

AGGGAATGAC CAAGTAGCTT TAGGTCATTT TGGCTCAATC AGTTACTTTG CCGCAGTGCT 2520 

GTAGGTCAAT TATTCAAAGG TAAGTCTAAA TCAACGGAAG AGATTAAATT TCCAAAAGGC 2580

TTAAGTTTCT TACGAGAAAG TACAATTAGT ATCTCGATTA CGATGGCATT ACTTTACTTC 2640 

ATCGCATGCT TATTTGCGGG CGTTAGTTAT GTACACGAAT CTATTAGTGA TGGTCAAAAC 2700 

TTTATTGTCT TTTCATTAAT TCAAGGTGTG ACATTTGCTG CTGGTGTATT TATTATTTTA 2760 

ACGGGCGTTC GTTTAATCTT AGCTGAAATC GTCCCAGCAT TTAAAGGAAT TTCTGAAAAG 2820

CTTGTACCAA ATTCTAAACC TGCATTAGAC TGCCCTATTG TGTTCCCTTA TGCACAAAAT 2880

GCAGTATTAA TTGGATTCTT TGTCAGCTTT ATTACAGGTG TCATCGGTAT GTTTATCTTA 2940

TTCTTATTTG GTGGCGTCGT CATTTTACCT GGCGTAGTTG CACACTTCTT CTTAGGTGCA 3000

ACGGCTGCTG TATTCGGTAA TGCAAGAGGC GGTATTAAAG GTGCTATTGc TGGCGCCGCT 3060

CTAAATGGTA TCCTAATCAC GTTTTTACCA TTATTATTCT TGCCATTTTT AGGCGAATTA 3120

GGTGGTGCTG CAACAACATT CTCAGATACA GACTTTTTAG CTGTCGGTAT CGTGTTCGGT 3180

AACGCAGTAA AATATATGGG ATTATTTGGT GCGATTCTAT TTATTATTAT CGTAGGTGCG 3240

ACAACAATTT TATTAAAAGG CCGTCAAAAA GAACAGCAAT AGTGTTAACG TAGAAATATA 3300 

AAACACCGTC ACATATTGAG TGAATGCCCC TTTtATCAAG AGGAAAGCCA CTTACTTATG 3360

GACGGTGTTT TGTATTATAT TAAATGATAC TTAGCCATAC TATCGACAGC TGCTAAAATT 3420

GCTTCTTCTT GTGTCGCAAT CGGTTCCCAA CCAAGTAATG TTTTTgCACG TTCGTTACTT 3480 
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CCTAGACTCA AAATAAAGTC TGGTAATTTT TTAGTAGAAA CTTTTTGAGC TATTTCAGGT 3600 

CTCTTTTCTT TAATTAATTT TGCAATTTCC AACAAATTAA TTTGTCCATC AGCCGTCGCA 3660 

ATAAATCGCT TGCCATTAGC TTGTTCATTT GTCATTGCCA AAATGTGCAG TTCAGCTACG 3720

TCTCTCACAT CAACAACATT TAACGGAATT TGCGGTACAC GTTTCATTGA ACCATTCAAT 3780 

AAATTTTCTA ATAAATGAAA GCTTCCTGAA ACGTGTGCAT CTAATGATGG CCCAAAAATT 3840 

GCAACTGGAT TGATTGTGGC AAATTCTACT GTTGTATTTT CATTCT 3886 

(2) INFORMATION FOR SEQ ID NO: 89: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 4879 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89: 





GTCATCTATC AAAAATTTGG TATACAGACC GACAATTATT AATTAATAAT TTAATTTCCC 60

AGGCAATACC AGTGATTAAA TATCCACAAA TACAACATAA AGAACAACCA TTAGAATCTA 120

TTTCACAACT TATATTGTCT AAGATGACAT CTAATCAATA GTGTTTAAAT TTCTCAGTGG 180

CTGTGAATGA GGTTTAAAAG TACTATAAAA CGTAAACTTT GATACTTTAA AATACGCAAA 240

AAACGGTAAA CCCTAATTCA TATTATAGAG TTTACCGTTT TATTTTTTAA CTTGCATCAT 300

AGTTATATTA ACATTATTGT TGGTAGTTTG GATCAGTAAC CATTGCTTGT CCAGTATAAT 360

CAACCGTTAC AATTGAATAT TTTCCaTTTG CATTTGGGTC TTTAAAACTA AACACATACT 420

TATAGTTGCC ATTATGTTCT TCAATAGAAT AATCATTATA CACTTTATTA TTACTACCAA 480

A'LTi'&TTTGC TTCATTATTA GCCGCATTTA AAGCTGTTTG GAAATTTGGC AATTGCTGTA 540

AAGCTTGATT TTTATTTCCA TTAAACGGAT AAATTTGACG TGCAACCGGC GCGGCATTTT 600

GnCCATAATA TGGTGCAACG TAACTTGATT TTTGATTATT ATTCGCTTGG TTATTACTTG 660

ATTGGTTATT ATTTGTTTGG TTTTGGTCAT TGTTTGTTGC ATTTGAATTA GATTGTTGCT 720


GGTTATCGTT TGCACTATTA TCTTTATTAT CTTTGTTTAC GTCTTTACTA TCATCTTTAT 780

TATCTTTCTT ATCTTTAGAT GAATCATTTG TrrrriTATC TTGTTGTTCA GTTTTCGCTT 840

TATCATCTTT TTCTTTATTA CCGTCTTTTT GTTGGTCACT ATCTTGACCA CATGCAGCTA 900


AAAATAATGA TAATGCTAGT AACCCTGTAA CTAATCTTTT CATACATATC TCCTCCTATA 960

ATTCGATATT CATTGAATAA TCTTGAAATA CATATCTACC ATGTGTATCT TTTCATGGCT 1020
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TAAGGTTCTT TTTATTATAC CCTAATTTTT GTTCATTATT ATTTAATTTT TGTGAATTTT 114 0 

ATGtTTkCTA TAAATTTAAT TATTTTACTT TAACAATTCA TTACGCATTT AGCATTTCAA 1200 

GGTATACACA ATATTTATTA CTATGATTTC ATTTTATCTG CTGCAAAAAC AATCATTATA 1260

ACTCTTTTTC CATAATTAAA TCTGTATCCG TTACATCACC TGTTTGAAAA TGATGTTCAC 1320

CAACCACTTT AAATCCATGA CGTTTATAAA ATGCTTGAGC ACGAGGATTA TGCTCCCAAA 1380

CTCCTAGCCA AATTTTATGT TTATTATGTT CTTGAGCAAT TTTTTCGGCC AATTCTATCA 1440

ATTGTGAACC TCTTCCGCCA CCTTGAAAGT CTTTCAAAAA ATATATGCGC TGCACTTCTA 1500 

AATAGGTCTC CCCCATTTCT TCAGTTTGAG CACTATTAAT ATTCATCTTT ATATAACCAA 1560 

CATTCGCACC ATCTTCTTGa TAAAAATAAT GAAATGAATC TACATGGTTA ATCTCTTGTG 1620 

TAAATTTCTC TACAGTATAA TTGTCTTTAA AAAATTGATC AAAATCTTTG TCATCATAGT 1680 

AAGAACCAAA CGTGTCATAA AATGTTCTAG TTGCTAATTC AACTAATTCA CTAGCATTTT 1740 

GTTCTGAAAT TTCTTTGATT ATCCCAGCCA TATAAATCCT CCAATAAACA GTGATCGAAT 1800 

CAAAATATTA CTTATGTTAT TTTTCAGCCA AAACTATTTA AAAATACATT AACACAAATC 1860 

AATTACAAAT TGTATTGATT GTGTGTAACA TCAATAAATG ATACATTTAT TCCAGTAAAA 1920

TGGCCGTATT TTCAAAAGAG AAAAAGAGAG GATGTATCGT TGTGATAGAA ACATTTAAAG 1980

CGTTTGTAAT TGATAAAGAT GAGAGTGGTA AAGTGACACC AACTTTCAAA CAATTATCGC 2040

CTACTGATTT ACCTAAAGGA GATGTGCTGA TTAAAGTACA TTACTCTGGT ATAAATTATA 2100

AAGATGCTTT AGCGACTCAA GATCATAATG CAGTCGTAAA ATCGTATCCT ATGATTCCAG 2160 

GAATAGATTT AGCTGGAACA ATTGTTGAAT cCGAAGCACC AGGCTTTGAa AAAGGAGAAC 2220 

AAGTAATTGT AACGAGTTAT GACCTAGGTG TCAGCCATTA TGGCGGTTTT AGTGAATATG 2280 

CGCGTCTAAA ATCAGAATGG ATTATCAAGC TTCCTGATAC TTTAACATTA GAAGAATCAA 2340 

TGATATATGG CACAGCTGGT TATACTGCCG GTTTAGCAAT TGAAAGACTT GAAAAAGTTG 2400 

GAATGAATAT TGAAGATGGT CCTGTACTCG TTCGCGGTGC TTCAGGTGGT GTCGGTACTT 2460 

TAGCAGTACT CATGCTTAAT GAACTTGGTT ATAAAGTTAT CGCAAGTACA GGTAAACAAG 2520 

ATGTTAGCGA TCAATTACTT GAACTTGGTG CCAAAGAAGT TATCGATCGA CTTCCTGTTG 2580 

AAGATGATCA TAAAAAGCCA CTCGCATCAT CAACTTGGCA AGCTTGTGTA GACCCTGTTG 2640 

GTGGCGAAGG TATTAATTAT GTTACAAAGC GTTTAAATCA TAGTGGGTCA ATTACAGTTA 2700 

TTGGTATGAC TGCCGGTAAT ACTTATACTA ATTCTGTATT CCCTCACATT TTAAGAGGTG 2760

TAAACATTTT AGGAATTGAC TCGGTATTTA CTGCTATGAA ATTAAGACAG CGCGTTTGGC 2820 
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TTGATGAACT TCCAGAACAA CTTAACAAAG TAATTAAACA TGAAAATAAA GGGCGCATTG 2940 

TTATCGATTT CGGTGTAGAT AAATAGTATT CATGAAAAAG ACATCCCGTT ATGCGAGATG 3000 

TCTTTTTTAA TTTAGTATTT GATATACATA CCGCCTGAAT CTGGTTCGGT AGGTATAAAT 3060 

CCAAATTTTG TATATAATTT ATCCGCTGGG TAGTCTGCAA TCAGAcTAAC GTATGTACTC 3120 

TCAACAGCCA CACCTTTAAT ATATTGCATA ATATGCTCCA TAATTAGACT GCCGTAACCT 3180 

TGACCTTGGT AACTTTTCAA AACTGCAATA TCAACAATTT GAAAAACAGT TCCGCCATCG 3240 



CCAATCACTC TACCCATACC AATTAACCGA TCTTTATCAT ACAAGGTTAC TGTAAATAAG 3300

GCATTAGGTA ATCCTTTTTC aGCTGTTCGC GCGTCTTTGG ACTCATACCT GCGTTAATCC 3360

TTAATGCGCA ATAATCCTCG CAAGTCGGAA TATCATATGT CACTTTAACC ATTATTTACC 3420 

CCACTTTTCA TCACACAATA TATCAACCTA GTATAAATGT TTATTTACAA TAGTCTTATT 3480 

CGCTTCTTTA AACACTTCAT GATGACTTGA AACATAACCC TCTGCATTCG CATCTGGTTG 3540 

GATATATGTT TTAGCAAGGT TCGCTGCATT TGCACCATCA CTAAATGCAC TTGCAATTAG 3600 

ATGTGATTTT GCATCATGAT AAACAATATC TCCACACGCA TAGATACCAG GTATACTAGT 3660 

TGTCGTATTA CCAAATCCTT TAACACGACA ATCATCATGC ATATCTAGCT TTGAAGATGT 3720

TtCACTCAAT AATGTATTAC AACGATCAAA CCCATGACTA ATAATGACAT CGTCAAATTT 3780

AACTGTATGC CTATCGCCAC TTTCAACATG TTCCAAAACA ACTTCACTTA TATGCGTTTC 3840

ATCATCATTG CCGACCAAGT ATTTAATACG TGTTTTTGGG CATAGTTTCA CATTTAAATC 3900

TGTCACCAAC GTTTTCATCG CTTCATGACC ACTTACATCT TCTTTTCGAT AAACAACTGT 3960

CACGCTTTTA GCAATCTTGG CAATATCATG CGCCCAATCT AATGCTGTAT TTCCTCCACC 4020 

TGATATTAAT ACATCTTTAT CTTTGAAACG TCTGTAACTT TGTACAACAT AATGTAAATT 4080 

AGTT5ATTGA TATCTCTCTA CACCTTTAAC ATCTAATTGT TTTGGATTAA TAATACCCGC 4140 

ACCAATTGCA ATGATAACTG CTTTCGATGT ATATATTTCT CCCGCTTCTG TTTCAACTTC 4200 

GAAATGACGT TCTGCCTTTT TCCTAATATC TACCACACGT TCATTCAAAT GAACTTCCGG 4260 

TTTAAAATAT AATCCTTGCT TAATTGTATC TTTTAAAATT TCATGACAAG GTTTTGGCGC 4320 

AATGCCGCCA ATATCCCAAA TAATTTTTTC AGGGTAAATT CTCATCTTAC CCCCTAATTC 4380

AGATTGAACA TCTATCAATC TTACAGACAT ATCTCGCAAT CCAGCATAAA AGCTTGCATA 4440

CAAACCAGAC GGACCGCCAC CAATGATTGT AACATCTTTC ATTATGTGCC TCCTATGACT 4500

CTCTATATTC ATTTCTTTCA TTAACGTGCT CAAATTGATA ATTATTATCA TTTAAAGCCA 4560

TTATACTATT AATATTTATA TTGTTAAAAT AAATCGCATA GTTAGCCATG AATTATCAAT 4620 






GAAAGATGTG TATATTTTTT AGTTCTAGTT ATATTATTTT TTAAAAGACT CATCACGTGG 4740

TTCTTTAAGA ATTGCTTGTC TTAAAAGGAA AAATAGCAAC AATAAACCTG CAAGCATACC 4800

TGTGTGCCCA ATACCTGCAA AGCCTGCnAA TGCTTCTGGA GAGTATGATT TACCAGTGAC 4860

TTGGAAGAAT CCTTTTGTC 4879



(2) INFORMATION FOR SEQ ID NO: 90: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1560 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90: 

ATAATGTCTT AGaTTGATTG GGAGTTTTTT TAATTTTTTT GAAATTAAAT TAATCTGTAS 60

yTAATAAAAA ATTTGAATAA CTGACACAyT TTTTTGATCA TAGCTAyATA CTTTGTGAAT 120

TAATTCACAT TATAATAAGA GTGAAGATAA GAGTATTATA AATnATCTTT AAATAAATAT 180

ATGTGAAGTA AAAATTACAC GTTAGCATAT CGATTATGgT CATTTCkTTT AACATATTAA 240

CTgGGGaACG TTAAAAGTTA ACGGkTGATA TCyAACtAAA AACAAGGTCA CAGTAGTATG 300

TTTTAATCTG GCGTCTATTA CAAATAAAAA TTACATCTAT AATTATTCGT TTTCTTTTTT 360

GAAAGTAATA GCCAATTAAT ATCATACATA CTGGAGTGAC TATAAGGAGG ACATTATTAT 420

GAGAGCAGCA GTTGTAACGA AAGATCACAA AGTAAGTATT GAGGACAAAA AGTTAAGAGC 480

TTTAAAACCT GGTGAAGCGT TGGTACAAAC GGAATATTGT GGCGTTTGTC ATACCGATTT 540

ACATGTTAAG AATGCTGATT TTGGTGATGT TACAGGCGTT ACTTTAGGTC ATGAAGGTAT 600

TGGTAAAGTC ATCGAAGTTG CGGAAGATGT AGAATCATTA AAAATTGGAG ACCGTGTGTC 660

TATCGCTTGG ATGTTCGAAA GCTGTGGAAG ATGTGAATAT TGTACAACAG GTCGTGAAAC 720

ACTTTGCCGT AGTGTGAAAA ATGCTGGTTA TACAGTAGAT GGTGCAATGG CTGAACAAGT 780

TATTGTTACT GCAGACTATG CTGTGAAAGT ACCTGAAAAA TTAGATCCAG CAGCAGCGTC 840

TTCTATTACA TGCGCAGGTG TGACAACTTA TAAAGCTGTA AAAGTAAGTA ATGTAAAACC 900

TGGACAATGG TTAGGTGTTT TTGGTATAGG TGGTTTAGGT AACCTAGCTT TACAATATGC 960

TAAAAACGTT ATGGGGGCTA AAATTGTTGC CTTCGACATC AATGATGATA AATTAGCATT 1020

CGCGAAAGAA TTAGGTGCTG ATGCTATTAT TAATTCTAAA GATGTTGATC CAGTTGCAGA 1080

AGTTATGAAA TTAACTGATA ACAAAGGATT AGATGCAACA GTGGTAACTT CAGTTGCTAA 1140 
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TTTACCTGTT GATAAAATGA ACTTAGATAT CCCAAGATtA GTGCTTGATG GTATTGAAGT 
AGTAGGTTCA CTTGTTGGTA CAAGACAAGA CTTACGTGAA GCGTTTGAAT TTGCTGCTGA 
AAATAAAGTA ACACCTAAAG TTCAATTAAG AAAATTAGAA GAAATCAATG ATATTTTTGA 
AGAAATGGAA AATGGTACTA TAACTGGTAG AATGGTTATT AAATTTTAAA AATATCAACT 
GACTATATAG ATAAAGAAGG TAGTGCTCTG AACACTATCA TTATTAATCA AACCCCGAGG 
TTTTCCTGAA AAGATAGTGG nAAATCCCCG TGTTTTTTGG GTTTGAGGnG GTTGTnTGTA 
(2) INFORMATION FOR SEQ ID NO: 91: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11014 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91: 



GTCCTGTnGC 




zirrrvTA a a a 


a Trr aczczcz at 


ftTAATfttlATA 

vj inn X \J\Jf\ 


x x J. vy x in 


GTACTAATGA 


TAGAAATGAT 


AAAAATGAAA 


TCACAAAGGC 


TACGCTCGCA 


AAAGCTTGAC 


ATGTACGCTT 


ATCGCCATAA 


TCTAACCCTG 


TACGTATATG 


TAATAAATAC 


TGTAATCCGA 


TACTTAAATA 


CATAATTGCC 


ACGCATAAGA 


AGAATGGGAA 


GAATGTCTTT 


TCAAAGTCCG 


GATATAGGCT 


GTTAGATAGG 


AAGACCATGA


TGAACATATT 


AAACATCATA 


AACGAGACGT 


CITTGAATGT 


AACTTGACCA 


AATCGATTTG 


TAAAAAATGT 


TTGATGAGAC 


CACATTAACC 


ATAAGAACAA 


ACTCATGACG 


ATGTATTTGA 


AAAATAAATC 


AGCTGAAATG 


GAACCGTTTT 


GTGTTGTTAA 


AATCACATGT 


GCAATTTTTT 


GAATGGCATA 


GACGAAAATT 


AAATCAAAGA 


ACAACTCATG 


GAATCCTGCA 


CGCTTTTCAG 


CTAAATGTTT 


TGGTGTTAAT 


GCATTAACCA 


TAAAAITITA 


ACTCCTTTAA 


GATGTGTAAT 


TAATTTACTA 


AGTATACTAT 


TTATTTTTTC 


TAGTGAATAG 


GGGCAGATTT 


GGCGATGAAG 


TGGAAGGAGA 


GGTGACTGCA 


AGGTAATTGC 


GGAATTAACA 


ATCATCAGCG 


ATTTAATATT 


TGACTGGAGA 


CGTCATGGTA 


ATAAAAAATT 


GATGAGAAAT 


TGATGGTGAA 


ACCAGCTGTG 


AATAsCGaTG 


cAATGATrsA 


TAGaATTTAA 


TTAGAGTCAT 


TACGCGaAAT 


GATTAATGAT 


AATTTGTGGT 


AAATCAAAGC 


aTAATTTTGT 


ACTATAGATG 


AGGATGATAG 


AGCATATTTA 


AGAGGGTGAA 


ATGTTAAAGT 


GAAACCGTTT 


ACGTTTCCGA 


TTGCCCAAAC 


AAATTACATC 


ATTGTATAAT 


ATGATTTGTT 


AAATGCATAA 


CAAGAATGAA 


AATGTAACAT 


ACGTAGCAAT 


TGGTTTCATA 


AATTGGATGT 


TAGTGGCGTA 
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TGACGAGAGT CGTATTAGCA GCAGCATACA GGACACCTAT TGGCGTTTTT GGAGGTGCGT 114 0 

TTAAAGACGT GCCAGCCTAT GATTTAGGTG CGACTTTAAT AGAACATATT ATTAAAGAGA 1200 

CGGGTTTGAA TCCAAGTGAG ATTGATGAAG TTATCATCGG TAACGTACTA CAAGCAGGAC 1260 

AAGGACAAAA TCCAGCACGA ATTGCTGCTA TGAAAGGTGG CTTGCCAGAm ACAGTACCTG 1320 

CATTTACGGT GaATAAAGTA TGTGGTTCTG GGTTAAAGTC GATTCAATTA GCATATCAAT 1380 

CTATTGTGAC TGGTGAAAAT GACATCGTGC TAGCTGGCGG TATGGAGAAT ATGTCTCAAT 1440 

CACCAATGCT TGTCAACAAC AGTCGCTTTG GTTTTAAAAT GGGACATCAA TCAATGGTTG 1500 

ATAGCATGGT ATATGATGGT TTAACAGATG TATTTAATCA ATATCATATG GGTATTACTG 1560 

CTGAAAATTT AGTAGAGCAA TATGGTATTT CAAGAGAAGA ACAAGATACA TTTGCTGTAA 1620 

ACTCACAACA AAAAGCAGTA CGTGCACAGC AAAATGGTGA ATTTGATAGT GAAATAGTTC 1680 

CAGTATCGAT TCCTCAACGT AAAGGTGAAC CAATCGTAGT CACTAAGGAT GAAGGTGTAC 1740

GTGAAAATGT ATCAGTCGAA AAATTAAGTC GATTAAGACC AGCTTTCAAA AAAGACGGTA 1800 

CAGTTACAGC AGGTAATGCA TCAGGAATCA ATGATGGTGC TGCGATGATG TTAGTCATGT 1860 

CAGAAGACAA AGCTAAAGAA TTAAATATCG AACCATTGGC AGTGCTTGAT GGCTTTGGAA 1920

GTCATGGTGT AGATCCTTCT ATTATGGGTA TTGCACCAGT TGGCGCTGTA GAAAAGGCTT 1980 

TGAAACGTAG TAAAAAAGAA TTAAGCGATA TTGATGTATT TGAATTAAAT GAAGCATTTG 2040 

CAGCACAATC ATTAGCTGTT GATCgTGAAT TAAAATTACC TCCTGAAAAG GTGAATGTTA 2100

AAGGTGGCGC TATTGCATTA GGACATCCTA TTGGTGCATC TGGTGCTAGA GTATTAGTGA 2160 

CATTATTGCA TCAACTGAAT GATGAAGTTG AAACTGGTTT AACATCATTG TGTATTGGTG 2220 

GCGGTCnAAC TATCGCTGCA GTTGTATCAA AGTATAAATA ATAAGAAAAC AGGTTATCAC 2280 

AACASTATTA ATtACATGTT GGCATAACCT GTTTTTATTT GTTTATGGAT TTATTGGGTA 2340 

ATATTAGTCA TTTGATGGTT TAATTGCAAA TGCTCTAACA GGGAACCCAG GTGCATCTTT 2400 

TGGTTTAGGG CTGATAGCGT AAATGATGGC GCCACGAGTT GGTAATTGAT CTAAATTAGT 2460 

TAATAACTCG ACTTGGTATT TATCCTGACC AAGAATATAA CGTTCGCCAA CTAAATCACC 2520

ATTTTTTACA ACGTCCACAG ATGCATCGGT ATCGAATGTT TCATGACCAA CAGCTTCAAC 2580

ACGACGTTCT TCAATTAAGT ACTTCAAAGC ATCTAATCCC CAACCCGGTG CATGTTGTTG 2640 

TCCGTTCGCA TCTTTGTTTT CAAACTTTTC AATATTAGGC CAACGTTTTG ACCAATCGGT 2700 

ACGAAGTGCA ACAAAAGTGC CAGGTTCAAT AGTACCATGC TCTTTTTCCC ATGCTTCTAT 2760

ATGCGCACGT GTTACGATGA AATCATTGTT GTTCGCTACT TCTGTTGAAA AGTCTAATAC 2820
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AAAGTGAATT GGTGCATCAA TGTGAGTACC ATATTGCGTT ACAATATTCC AACGTTGCAC 2940 

ATAGAAACCA TGATCTTTAA CCGTGAATAA AGTTGAAACT TCGCCTTTTT CAAACTCACT 3000 

AAAACGTGGT ATTTCCGGAT CAAATGTATG CGTTAAATCA ACCCAAGTTG CTTGTTTTAA 3060 

AGTATTTAAT TGTTGCCATA AAGGATATTG TGTCATAAAA TCACCCGTTT TTAGTTTATT 3120 

ATATGATAAA TGCTGCGATT ATTCTTGGCG TTTAGCTTTA ACAGCATTCA CAAGCACAGT 3180 

CAATGCATCT TTAACTTCTT CTTCTTTTCG CGTTTTTAAA CCACAGTCAG GGTTTACCCA 3240

GAATAATGAG CGGTCGATTT GTTGTAGTGA ACGATTGATT GCTGTAGTAA TTTCTTCTTT 3300 

TGTTGGAATA CGTGGACTAT GAATATCATA TACACCTAGA CCAATACCTA AATCATAATT 3360 

AATATCTTCA AAGTCTTTAA TTAAATCACC ATGGCTACGA GATGTTTCAA TTGAAATAAC 3420 

ATCAGCATCT AAGTCATGAA TAGCATGAAT GATTTGACCG AATTGAGAAT AACACATATG 3480

TGTATGGATT TGAGTTTCAT CACGAACTGA AGACGTTGCA AGTTTAAATG ATAAAACAGC 3540

ATCTTTAAGA TATTGTTCGT GATATTCAGA GCGTAATGGT AAGCCTTCAC GTAATGCAGG 3600 

TTCGTCAACT TGGATAACTT TGATTCCTGC AGCTTCAAGT GCTAATACTT CTTCGTTGAT 3660 

TGCTAAAGCA ATTTGATCTT GAACGACTTT ACGTGGTAAA TCAACACGTT CAAATGACCA 3720

GTTTAGAATT GTTACAGGTC CAGTTAACAT ACCTTTAACT GGTTTATCTG TTAAGCTTTG 3780 

TGCATAAACT GTTTCATCAA CAGTTAAAGG CGCTGTCCAT TTTACATCAC CATAAATGAT 3840 

TGGTGGTTTT ACGGCACGTG AACCATATGA TTGCACCCAA CCGAATTTAG TTACTAAGAA 3900

ACCTTGTAAT TTTTCTCCGA AGAATTCAAC CATGTCATTA CGTTCAAATT CACCGTGAAC 3960 

TAATACATCT AAGCCAATGT CTTCTTGAAT TTTAATCCAT CGAGCAATTT CATTTTTTAA 4020 

GAATGTTTCA TATGCTTCGT CTGTAATGCG TTTGTTCTTC CAATCTGCAC GGTATTTTCG 4080 

AACTTCTCGG CTtTGTGGGA ATGATCCAAT AGTTGTTGTT GGTAAATCCG GTAAGTTCAA 4140 

ACGTTTTTGT TGTTGTTCAA TACGTTGCGC GAATGGTGAT TGTCTTGAAG TACGCACGCT 4200 

TTCGAAATCA TAATCTAAGT TTTTGAATGA TTGATTTTGG AAACGCTCAT AACGTGCTTT 4260 

TAATTTATCA TATTTAACAC TATCGTTTTG ATTAAA7AGG CGACGCAATG CATCTAATTC 4320 

GTCTAATTTT TCAGTTGCAA AGCTTAAGCC TTCGCCAACA CTTGTATCTA ATGTTTCATC 4380 

ATCTAAAGAT ACTGGAACAT GTAATAATGA AGATGATGGT TGAATGACAA GTTCATTAGT 4440 

GTGTGCTAAC AATTTATCGA TTAAGACTTT TTTAGCTTCA ATGTCACTTG CCCATACATT 4500 

ACGACCATCA ATAATTCCAG CGTATAATGT TTTTGATTTA TCAAAATCTC CAGCTTCAAT 4560

TTGTTTAAGG TTATAGCCAT TATCATGGAC AAAGTCTAAA CCTATACCAC CAACAGGTAA 4620
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AACACCAGCT TTTTCGAAAT AGTCATAAGC TTCACGTGTA ATATTTTCAT AGCTTTCGCT 4 740 

GTCGTCTGTA ACTAAGATTG GCTCATCAAC TTGAATGTAC TCAGCACCTG CATCAATTAA 4800 

TGATTCAAAC ACTTCTTTAT AAAGTGGTAA TAACGTTTTA ACTTTTTCTT CAAAAGTTTG 4860

GTGACCGCCT TTTGATAATT TAACAAAAGT AATCGGACCA ACAATGACAG GGTGAGCGTT 4920

AACGTTTAAA GATTGGGCAT ATTTAAAGCG ATCTAATAAT ACATTGCGAC TCACTTTAGG 4980

CTCAACATTG TCCCATTCAG GTACGATGTA ATGATAGTTA GTGTTAAACC ATTTTATAAG 5040

TGCACTTGCA ACATGGTCTT TATTACCGCG AGCAATATCA AATAATAAAT CATCATCAAT 5100 

AGTTCTTCCT TGGAAACGTT CAGGGATGAT GTTGAATAAT AATGACGTAT CTAATATATG 5160 

GTCATATAAA GAGAAATCAC CAACTGGGAT GCTATCTAAG TGATAGTACT TTTGtAATAA 5220

TAAATTTyCT TTATGTAGAT CAGTTAATGT TTGATCTAAT TCTTCTTTAG AAATCTTCTT 5280

TGCCCAATAA CTTTCGATGG CTTTTTTCCA TTCTCTTTTT CTACCTAATC TTGGGAATCC 5340

TAAGTTTGAT GTTTTAATTG TTGTCATAAT ATTGCCTCCT TGTGAGCAGT AATAGATTTT 5400 

GAGTATGCTG CAAGTTCTAA TGAATCTTCG ACATTTTGAA ACGGTGTGAT AATGTATAAA 5460 

25 CCATTAAAAT ATTCATGAAC AGTATCGATT AAATCCTTTG AAAGCTTAAG ACTTAGTTCT 552 0 

CGTGT7TTGG CTTTATCATC TTTAACTGCT TCAAATTGTT GTAAAATTTC ATCTGACATC 5580 

TTGATTCCTG GCACT7CATT ATGCAAAAAG AGTGCGTTTT TGTAACTTGC GATAGGCATA 564 0 

ATGCCTATGA AAAATGGTTT GTTCAAGTGC TTAGTGGCAT GGTAAATTTC AATGATTTTC 5700 

TCTTTGCTGT ACACGGGTTG TGTTATAAAA TAAGACATTC CGCTTTCTAT CTTTTTCTCT 5760 

AATCTTT7GA CGGCACCATA TAATTTACGA ACATTAGGGT TAAAGGCGCC AgcGATGTTG 5820 

AAGTGTG7AC GTTTCTTCAG CGCATCACCG TCAGTGTTAA TACCTTGATT AAATCTTAGA 5880 

GCGSGTTCAG TTAATCCTTT AGAATTAACA TCATAGACAT TGGTTGCACC TGGTAAGTGA 5940 

CCAACTTTTG AAGGATCACC AGTTATGGCT AATATTTCGT TAACGCCAAT GAGCGATAAT 6000 

CCAAGTAAAT GGGACTGCAA GCCGATTAAG TTTCGGTCTC GACATGTAAT ATGTACGAGT 6060 

GGTTCAATAT TGTAATATTG CTTAATTAAG CTAGCAGCAG CAATATTGCT AATTCTGACA 6120 

45 GTTGCCAATG AATTATCTGC GAGTGTTACC GCATCTACAT TAGCTTTATC AAGTTTAGCG 6180 

ATATTTTCAA AAAATCTATC CGTGTCTAAA TGTTTCGGTG TATCCAATTC GATAATAACG 6240 

GTTGGACGTT CTTGAACCTT AGATGTTAAT GATTGTCTAA CTTTATTTTG AGATGGATTG 6300 

50 AAAAGTGCTT TCGTTGGTAT CGGAATCACT TTTTTGTCAT TAACAGGTTT AAGTGTCTGA 6360 

ATAGATTCTT TAATAAA7TT GATGTGCTCT GG CGTTGTAC CACAGCAACC ACCAATTAAA 6420 
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TACTTAAATT CACTATTTTC AATATCTAAT AAGCTGGCAT TTGGATAACA AGATAAGAAT 6540 

GCGTGCTCTG GTAATTCAAT ATGTGTGAAA GACTCTTGCA TATGGTGCGG GCCATGATGA 66 00 

CAATTGAGTC CCACGATGTT TGCACCACAT TGAACGAGTT GTTTTAATCC TTCATTGATT 6660 

GCCTGACCAT TAACTAAGTA ATTTGTGTTT GAAGCGGTTA ATTGAGCAAT GATTGGAATG 6720 

TCGTATTTCT TTCTCGTTCG TGAAATGACA TTTGTTAACT CTTCTAGGTC GTAATACGTT 67 80 

TCGAAAAGTA GCGCGTCAAC GCCTTCTTCA ATTAAGGTGT CTATTTGAAT TTCAGTATGA 684 0 

TAAAGAATAG TTTGTAAGCT GATATCCTCT TGTTTGATAC CTCTAAACCC ACCAACTGTG 6900 

CCTAATATAT ACGTATCTTT ATTTGCTGCT TTTTTTGCGA TGCGAACGGC GGCTTGATGT 6960 

ATTGCTTTAA CTTTATCTTC AAGACCGAAT CGTTTTAACT TTTCAAAATT TGCACCATAA 7020 

GTATTGGTTT GAATGACATC AGCACCGGCT TCAATATATG AACGATGGAT GCGTTCAACT 7080 

20 TTATCTGGAT GGCTAAGATT ATATGCTTCT GGACAGGTGT CTAATCCTTC AGAGTATAAA 714 0 

ATGGTTCCTA TAG CGC CATC AGCTACTAAA ACATTATCTT TCAATTGTGT GAGGAATTGA 7200 

CTCATTGAAT GCCTCCTTTA ATGCGTATTT GATGTCTGCA ATGAGTTCAT CAGGATCTTC 7260 

25 GAGACCAACA CTTAATCGGA ATAGACCGAA AGTGATACCA CGTTCTTGTC TCACTTCTTC 7320 

AGGTAGTGCA GCGTGAGACA TTGTTGCTGG ATGTGAAAGG ATCGTTTCAA CACCGCCCAG 73 80 

ACTCACTGAA ACGAGTGGTA ATGTCAGTGC ATCGACAAAT TGTTGTGCTT TAG ACT CATC 744 0 

AGCTAAACGA AAGCCAATAA CGGCACCGCC ATTTTTAGCT TGTTCTAAAT GAGCAGTAGT 7500 

GAGTCCCGGA TAATAAACTT CTGAAATTTC ATCTTGCTTT ATTAAAAATG ACACGATTTT 7560 

TTGAGCGTTT TCGACAGATT GTTTAAATCT GATTGGAAAA GTTTTTAAAT GTTTAGCAAG 7620 

TGTCCAGCTA TCCTGAGCAG ATAACATATT GCCTGTACCA TTTTGTATTA AATAAAGAGC 7680 

GTCACTAATT GCCTCATTAT TAGTTATGAC AGCACCAGCA ATTAAATCGC TATGTCCACT 774 0 

TAAAAATTTT G7AGCACTAT GAATGACAAT ATCAGCGCCA AGTAATAAAG GTGATTGACc 7800 

TAACGGTGTC ATAAATGTAT TGTCCACAGC TACCAGTAGT TCATGCTTTT CGGCTATTTT 7360 

AGAAACAGCT TTGATATCAG TAATTTTAAA ACAGGGATTC GATGGTGTTT CGATATAAAT 7920 

45 TAATTTTGTG TTTGATTGAA TGGCACCCTC GATTTGTTCG AGCTTTGTAG TATCTACGGT 7 980 

TGTAAATTCA ATATTAAATC GATTCAAAAT TTGCTCAGTG AGGCGAAAAG TACCGCCATA 8040 

TACATCATCG GGTAAGATGA CATGATCACC AGATTTGAAA GTCAAAAGTA CTGCTGAAAT 8100 

50 AGCAGCAATA CCTGATGCAA AAGCAAAAGC GAATTTTCCC TGTTCTAATC GTGCTAACTT 8160 

CTCTTCTAAA AGTTCACGGT TAGGGTTGCC cTTCGTGCAT AATCATATTT AACATCGCCA 8220 
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TCCACACCTC TACGCCAATC GAATATCACT TCTGTCTCTT TTGAAAGTGT CATACAATCT 3340 

C7CCAATCTG AGCTTTATCT AATGCTTGGA TGATATCGCG TTCGATGTCT TCATAATTTT 8400

CAACACCTAG TGATAAGCGG ATTAAATACT CATCAATGCC ACGTTTATCT TTTTCAGCAT 8460

CTGGCATATC AACATGTGTT TGGGTGTAAG GGAAGGTCAC TAATGTTTCA GTACCTCCTA 8520

AACTTTCTGC AAAAATGCAA ATGTCTAAAT TTTCTAATAA TTTAGCGACG CTATAGGCCT 8580

TGTTAAGTCT TAAACTAAGC ATGCCAGTTT GCCCGCTATA TAGTACTTCG TCAATTGCTT 8640

GAAGTGACTG ACATTTTTTA GCAAGTTTTC TAGCGTTTGA TTGCGCACGC TCAATGCGTA 8700

AATGCAAAGT TTTAAGTCCA CGTAACAACA AATAACTATC TATTGGTGAA AGTGTTGCGC 8760

CAGTCATGTT GTGAAAATCA AACAACTGTT GCGCGAGTGA TTCATCTTTG ACGGTTACGA 8820 

CACCTGCTAG TACATCGTTA TGTCCGCCAA TATATTTCGT GGCTGAATGT AAGACTATAT 8880

CAGCACCTTC TGCTAGTGGT GTTGAAAGAT AAGGTGTTAA AAAAGTATTG TCGATAATTG 8940

ACAATAAGCC TTTAGCTTTA CAAAGTTGAT AGTATGGCTT TACATCAATA GCAATCATTT 9000 

GTGGGTTAGA TATTGGTTCA ATGAATAATG CAACTGTTTT ATCAGTGATT TCTTTTTCAA 9060 

CTTGTTCATA ATCTGTAAAA TCAACGTACT TAAATTTGAT ATCGTATTGT TGCTCGTAAA 9120

ATTCAAATAA TCTAAATGTG CCACCATATA AATCGAATGA AACTAAAATT TCATCATGAG 9180 

GTTTAAATAG ATTACATATT AATTGAATGG CTGACATTCC ACTTGATGTA GCGAATGATG 9240

CAATACCATG CTCAAGTTTG GCAAAACAGG TTTCAAATGT TGAGCGTGTA GGATTTTTAG 9300

TACGTGTATA ATCAAAACCT GTCGATTGTC CTAGTTTTGG ATGCTTGTAG GCAGTAGATA 9360

AATGGATTGG ATTCGCTATA GCACCGGTTG AATCATCGGT TAATGTGATT TGGGCTAACT 9420

GTGTATCCTT CATATTAAGA CCCTCCTATA AGAAAAAATA AAAAAAGCTT CCGTCCTTCG 9480

TACCCGAATG AATCGGATAA AAAGGACGAA AGCTTATGTT TCGCGGTACC ACCTTTATTT 9540

GTTATTCCAT CGCTGAAATA ACCTTATTCA GTACGCATTA AAAGTAAATA TGCTTACTGA 9600

ACAATTATCA CAATTAAAGT CAGTAAGTAA GGATATAGTA ATGTGCTATC CCATACTTAT 9660

TAACAAAAAA TCGTGCGTAA AGAATCCAGT ACGCCATTTA ACATCAATGT TAATACTGTA 9720

TCGCTATAAC GGGCGAACCC GTAGACACCT CATATTGGCA TCAACACTCC AAGGCCATTT 9780

TCAAACACGC TTTCAAAATC TTCTCTCAGC TACTAAAGAC TCTCTGTATA AGCAGGGTGT 9840 

GTTTTACTTy CCTCTTTATT GTGTTTACGT TTCATTAAAC TGTTATAAGA TATTAATTAG 9900 

CTTACAGAGT AAAAAAAGAT TTGTCAACAA TTATTCAGAA AATTTTGATT TAAAAGTTAA 9960

TTTGTTTGTG AAATTGTAAT TGGTATCTTG AAGTTGAAAA ATGAATTATT TTTTAAATAA 10020 
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TCAAATAAAA AGTGATGTGA GTGAATTGTC AAAAAGTGAA GATCAACGTA TTACTAAAAC 10140 

AAAAGATGAA CAAATTAAGC AAATAGATAT ATCGGATATC AAACCGAATC CGTATCAGCC 10200 

CCGAAAAACT TTCGATGAAA ATCATTTAAA TGATTTGGCA GATTCAATTA AGCAATATGG 10260

AATTTTGCAA CCAATTGTGC TTAGAAAAAC AGTTCAAGGT TATTACATTG TAGTTGGTGA 10320

AAGAAGGTTT AGAGCTTCGA AAATTGCTGG TCTAAAATAC GTATCAGCGA TTATCAAAGA 10380

TTTAACAGAT GAAGATATGA TGGAACTGGC GGTCATCGAA AATTTACAAC GAGAAGACTT 10440

AAATGCGATT GAAGAAGCTG AAAGTTATCA ACGTTTGATG ACAGATTTGA AAATTACACA 10500

ACAAGAAGTA GCGAAACGAT TGAGTAAGTC GCGCCCGTAT ATAGCGAATA TGTTGAGGTT 10560

ATTACATTTG CCGAAAAAGA TTGCTGACAT GGTAAAAGAT GGGCGACTGA CAAGTGCACA 10620

TGGACGAACG TTATTGGCAA TTAAAGATGA ACAACAAATG CTTAGGTTAG CGAAACGGGT 10680

TGTTAAAGAA AAGTGGAGTG TCAGATATTT AGAAAACCAT GTTAATGAAT TAAAAAATGT 10740

TTCGTCAAAG TCGGAAACAG ACAAAGTAGA TATAACTAAG CCTAAATTTA TAAAGCAGCA 10800

AGAACGACAG TTGCGAGAAC AGTATGGTAC CAAAGTAGAT ATATCAATAA AAAAATCGGT 10860

TGGTAAAATC TCATTTGAGT TTGATTCACA AGAAGATTTT GTGAGAATAA TTGAACAATT 10920

AAATCGTAGG TATGGTAAAT AGTTACACAA TTTTATATAA TAACTCTTTG TGCAAGTGTA 10980 

AATAAATTGT AATCAGTGAC ATTTGATTCT AGAT 11014 

(2) INFORMATION FOR SEQ ID NO: 92:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6022 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92:

TCCCCTTATG GAATTTCACA TTCTAGTTTA CATAATATAT ATTATAGGAA GTTATATGTG 60 

TGTAACGCAA AAgGTACCCT ACATCATAAT CATTATCTAA TATCGTCACA TAACTTACTT 120 

ATGCTATAAT CATGGTATTA TATTGTTTGG AGTGATTTGA TGAGATTTGT CTTTGATATT 180 

GATGGTACGC TTTGTTTTGA CGGCCGATTA ATTGACCAGA CTATTATTGA TACATTGTTA 240

CAATTACAAC ATGATGGTCA TGAACTTATA TTTGCATCAG CACGTCCGAT TCGTGATTTG 300

TTGCCAGTTT TACCATCAGT ATTTCATCAG CACACATTAA TTGGCGCAAA TGGTGCTATG 360

ATTTCACAGC AATCAAAGAT TTCTGTTATC AAACCAATTC ATACTGATAC ATATCATCAT 420 
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GCTGCACAAC TTGACGCTGn AGAACGCGAT TTTTGAGCGT TTAGATCCAC ATAAGCTGGC 540 

CAGTTGTATT GATGTTGCAA ATATCGACAC GCCAATCAAG AkTATTTTAT TAAATATAGA 600 

CCCGGCACAA ATTACAACTA TATTAGACGA GCTAGATAAA TACCATCAAG AATTGGAAAT 660 

GATTCACCAT TCAAATGAGT ATAACATTGA TATAACAGCG CAAAATATTA ACAAATATAC 720 

TGCATTACAA TATATATTTG ATGCAGATGT TAAATATATA GCATTTGGTA ATGACCACAA 780 

TGATATTGTC ATGTTACAAC ATGCTAGTAG TGGCTATATT ATAGGACCAT CAGAAGCATA 840

CACACACGCA ATATTGAAAC TTGATAAAAT CAAACACATC AATAATAATG CACAAGCTAT 900 

TTGCAAAGTC TTAAAATCAT ATAAATAAAA ACACCCCTAT CAAATGATAA TCATTATCAA 960 

TCGATAGGGG CTATTTTAAT AAAATTCGTC CTCGAACATT TCTTCCTCTT CATCTAATCC 1020 

AAATAATTCT GCCATTTCTC CATGTTCAAT TAACATGTTT AAATATGCAT CGCGGAGTTC 1080 

TTCTTCACTC ATATCATTAA TCATTTCTTT AAGACTATCA ATCCACATAT TTCTGCGTAA 1140 

TTGATAGTCT TCTTCAACTT CGTTTAACAT CATTATATGT TTATTTGCTG CTTCTGGACT 1200 

AGCTGTAAAG AGTAATGCAA TCATATGTTT ACATATCACT CGTCTTCCAT CAGCATGAGG 1260 

ACAATTACAT ATGGATTTTC TAGGATGTTC CATATCAATA TAACAACGAT ATACTTTGTT 1320 

GCCACTGCCC TTTACTTCAG CCTCATGCTG CGTTTCTGAA AATGATTTTA AGTTAATGAC 1380 

GCATTCACTT TGATAATAAT TAAAGCCTCT TTCTATAGAA CGAATACTTG CAATATCAAG 1440 

TAATCCCATT AATGaTACTC CTTTTTATTA TTATTTTTAA ATAAAGAaAA TAAAATAGA7 1500

AAGTGTCTAG ATTAAAATAC TTGATTTATC TATATTTTAT AACAAGTCTA GAATTATCGC 1560

ATTCTTAAAT aacTAATATG AAAATGcTTG CACTAATTCt TTTGTATAAG GGTGTCTATC 1620

AACATTAAAT AATTCCtCTA TTGCAAAATC ATCGACTATC ATGCCATCCT TAAGAACGAT 1680

AATTfTATTA ACTAAGCGTT GTAACACGGA TAAATCATGA GAAATAACGA TAAAATGATT 1740

TAAGTTCGTA ATCGTTTGCG CTTTTAATAT ATTGATTACA TTTTGTTCAG CTATAACATC 1800

TAAATTTGAA GTTATCTCAT CACATATTAA AACGCGAGGC TGTGCTAATA ACGAACGCAT 1860

GACATTAAAT CTTTGTAATT GTCCGCCACT CACTTCGCTT GGTAATTTAG TCAATAATTG 1920

CGCGTTTAAC TCAAAAGTAG ATAAATGTTG TAATAATAAT TGATCCTGAG CAGTATTATC 1980

AGTTAGACCT CTGTAATAAT ATAACGCTTC TTTTAATGAG GTCTCAATCG TCCAATCAGG 2040

GTTAAAGCTA GTTAAAGGGT GTTGGAAAAT CGGTAACACA GCATTGTCAC TTAAGTAAAT 2100 

CTCTCCTTTA ACAGGTTTAA ACAAGCCAAG AACCAATGAA GCGAGCGTAC TTTTACCACA 2160 

GCCACTTTCG CCTAAAATAC CAACATTTTC TCCATCAGGT ATAGTAATAT TGATATCTTG 2220 
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CCCTCTT7AA TTGTGTTCTA TATTTAATTA 
GACGTTCAGT ATACGGATGC AAATGCTCAT 2340

ACTTGAAATG ATTAATATTA CCTCGTTCAA TGATTTGACC TTCTTTTAAA ACATAAATGT 2400

ACTGACAATA TTTCAATACA TGACTTAAGT TATGTGTGAT AATAAATAAT GTTTGACCAT 2460

GTTCTAATAC AATATGCTGT AATAAATCCA TCACTTGATT ACCGTTCAAA GCATCCAATG 2520

ATGCAACTGG TTCGTCTGCA ATGATTAATT TAGGCTCCAA CATGAGAACG CTTGCTATGT 2580

ATACGCGTTC AAGTTGGCCC CCAGAAAGTT GGAAACTATA TTTATTTAAT ATATCTTTGC 2640

TTTGTAAATT AACCCACGAC AAAGCCTTAT CAACTTTGGA CAAAGCCTCT TCTTTACTAC 2700

CTTTATAATG CTTACGATAA ATCGCAGTTA ACTGTTTACC TAATTTAG7A TGGTCGTTAA 2760

AACTTTCTGC ATAATTTTGA GAAATATAGC CAATTGTATG ACCATAATAT TGACTCAATC 2820

TACTAACATT TTCCCCATCA AATTGGTACG AATCATACGT GCAGCTTAAA TCAAATGGTA 2880

AATATTCAAG TAAAGCTTTA GCAATCAAAC TTTTTCCAGC GCCGCTCTCT CCAATCAAGG 2940

CATTAATCTG TTGACTAAAA ATTTTCAAAT CAATCCCTTT AATAAGAGAT TTCTCACTAG 3000

TATTCTTTAT TGTTAAATTT TGTATATCAA TGAGACTCAT CATATTCACC CCGTTGTTTC 3060

AGCAATCTAT CTCTTAGTGC ATCACCGGTT AAATTAAAAA TTAAAATAGT TATAGCAATG 3120

ACTGAAGCAG GTGCAATCAA CATAATTGGA TGAGACGAAA TAAAATCACG ACCTTGTTGC 3180

AACATAGCGC CCCaCTCTGG TGTTGGCGGT TGTGCACCTA ACCCAATAAA TGATAGTGAA 3240

CTTATATATA GAATGATTTT ACCGAAATCA ACGACCATCA AAACGATAAT AGCCGGTATA 3300

ATTTTAGGTG TTAAATGACG TATTAATATT GTTCTTGTTG GTACATGAAA TAATTGTGCC 3360

ATTTTTATAT AAGGCTTATT CATTTCGCTA TTAACTATAC TTCTAGTCAA CCTTGTGTAA 3420

TTCATCCATT TTATTAATGT AATTGAGATA ACTAAATTCC ATAAAGATGG TTGAAAAAAA 3480

cttg£taaag CAATCATGAT GATAAATTCT GGAATACTTA GACCAACATC AATAAACCTT 3540

AACACTAATC GTTCAATCCA CCCTTTTTTG TATCCGGCAA ATAGACCTAG TGTAACACCT 3600

ATGACAACGA TAGCTATTAA TGTTAAAACA GTAACAAACA ATGTTGAACG TGCACCGATA 3660

ATAATTCGGG TAAATAAATC TCTCCCATAA TCATCAGTTC CTAATAAATG CAACCAACTA 3720

ATAGGTTCAA AAGTTTGTGA TAAATTGACT TTGGTTGCAT TTTCACTACT GACAAAGAAT 3780

TGCAGTACAA TTACCACAAA AATAAATGCA ACGAATACAA AAAATATCAG GTTATTCTTT 3840

GAAAATATTT TATGCATGAC GGTCACTACT TTCTGATATC AATGGTGTAT TGGTTTTGAT 3900

TTTTGGATTT CCTAATTGTA AACGCTGCTT CGGATCAAGT AATAACGTTA ATAAATCAGC 3960

AATCGTATTG ATAATAACAA CGAAGAAGCC AATAAATAAC ACGCATCCTT GAATAACAGG 4020






ATTTTCAATC ACTACAGTAC CACCTATTAG ACTGCCAAGT GAAATCCCTA GTAATGGGAT 
AATCGGCAAA ATTGTTGGTT TTAGTAAATC ATGAATTAAA ATATAACGTT CATTCATACC 
GCGTAATCTT GATGCTTGTA CGATATTACT TTGCAATAAC ATCAATAAAT TAGAACGCAC 
TAAACGAATG ATGTATGCAC ACATACCTAA AGATAGCGTG ATTACAGGTA ATATAAACTG 
ACTTAGTATA ACGCTATCTA TATTCATTAA ATTTGTGACA ATAAATAATA AAATAATACC 
GATAAAGAAC GCTGGTAAAC TAATCGATAG TGTTGAGATC ACTCTAATCA CTTTATCCGT 
CCACTTATGA AATCGTTTGG CTGCTATAAT GCCGAGCGGT ATAGATATGC ATAACGACAC 
TACTAATGTT GAAAATGATA TGAGTAATGT TATGGGTGCA TAGTTGAATA ATATCTGTGT 
TACCGGTTCT TTTGATTCAA AACTTTTTCC TAAATTAAAA TGTAATAAAT GATTCATCCA 
ATGCCACCAC TGTACCAATA AAGAATCATT TAATCCCAAT TTATCTTTGG TTGCATTTAT 
TTGTTCCGTC GACACTTGTG CTACATCAAG ATGTAATATT TTATCAACAG GATTGCCTGG 
TGATAATTTC ATTAAAATGA ATGTAAGTGT AGAAATAACA AATAAAACAA CTATCATTTG 
CATCAGTCTA TACAACATAG ACTTTATTAT GAACATAATA GTCCCCCTCC TTGTGTAAGT 
TACTAACACT TTCTTTTTAC ATGAGAATGG CGCATGTATA TGCAACTTAC ATATTAAGAA 
CTAACGTTCA TTATAGTATT ATCCATAAAG AAATTGAAGT ATATTTAATT TTTTAACAAA 
ATCATTATAA AATATAATAT TTTGAATCAA GTCAACCATG TAAAATATAA AAAAGTCAAA 
ACAAAAACAA CTATAGCACT GTATTCCATC TCTTTCGAAA TAATTGTTAC TGCAGTGTAA 
CTTAAAAGTC GATGATTTTG TGCATATAGT TGTCGAATAT TATTTTTTAT CTTTACGGCG 
AAGTTCAGCG CCCTCATAGC CGTATTTTTC AATTTGCTTT TCTAATTTAC GCGCTTTTCT 
TTCTTTACGC CAATTTCTAG TAAAATACCA TAATAGAAAA CTAATTAATA AACTCATAAT 
CGCTOAAAAT GCAGCGTATC CTAATAATGG TTGATATTTT ATATCTTGAA AATTTGGAAT 
AAAAAATGCA AGCACACCTA ATATAACAAA TGTAATTACT GCAGATACAA ACCATTTATT 
TAAAACTAAG CAACAGAATA TTGTTAATAA AATCATTATT AATGTTGTGA TCCATAAATA 
ATTAGGCATA TCGAATAATG TCATATTCAT TCTCCTTTTA TTTCATTACT TTCCTTGTAT 
ACATTTTATT ATAAATTTTT AAAAACTTAA ACAATAGCAG TCAGTTTCAA GCAATATTCT 
ATCTACTAAT AGAAAAATCA TTGTTCCTTG CGACATGGAA ATCGTAACAT TATCGTTTAG 
GAGACAAAAT TATGTATAAT GAATGTATTA TACCAAAGGA GTGATTATAT GTCTCAAGGT 
TTACCTTTAA GAGAAGATGT TCCTGTTTCA GAAACATGGG ATTTAGTAGA CTTATTTAAA 
GATGATCAAC AATATTATGA AAGTATTGAC GCTCTAGTAC AnCAAGCAAA TCAATTTCAT 
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GAAAATATTT TAATTGCCTT AGATCGCTTA AGTAATTATG CAGAACTACG TTTAAGTGTA 5940 

GATACTAGTA ATATCGAGGC ACAAGTATTG AGCGCTAAAT TATCTACTAC ATACGGTAAA 6000

ATTGTTAAGC CAATTATCCT TT 6022 

(2) INFORMATION FOR SEQ ID NO: 93: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 476 base pairs 
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93:

CCATCAATAA TGTATACATG ATTGGCATCA TATTCCCCTT TAATTAGAGA GCTACGTACA 60 

GTTTGTyTTA TTAAAGTAGA ACTAATAAAT AACCATCTCT TATGTGCACA AACACTTCCC 120 

GCAACAATTG ATTCAGTTTT ACCAACCCGT GGCATACCTC TAATGCCAAT CAACTTATGA 180 

CCTTCTTCTT TGAACAATTC AGCTAAAAAG TCTACTAACA AGCCTAAATC TTCACGCTCA 240 

AATCGAAAGG TTTTCTTATC TTTTGCATCT TGCTCAATAT ATCTTCCATG TCTTACTGCA 300 

AGACGGTCTC TTAATTCTGG TTTTTTAAGC TTTGTTATTT CAATTTCATT TATACCACGA 360 

GCTATTTGCT CAAAACGTTC AACTTTTTCA AGATTGTCTG TTTTAATTAA AAGGCCTCGT 420

TTACCTTGAT CAACACCATT AATTGTAACA ATACTTATAC CTAACATACC TAATAA 476 
(2) INFORMATION FOR SEQ ID NO: 94: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3633 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94:

AGAAATACAA CGAAGCATAT AAATATAACC GATCTTTTTT CTAATTGAAT ATTAAGTAAG 60 

TGTATGTACT TTCTGGAAGT AGCACCTAGT rGGATTGTtC CTCCTACAAC AGGCCAAAAA 120 

TTTTTATTTT TAACTGGCTT AACAGTGTTC AGTTTTTCAT ACTCTTCTCT ACTAATTTTG 180 

GCGCACCTTT TTGGAATGAA CCAATTAATA AATGGAAAAA AGTATACAAG CCAAGTTCTT 240 

ATTACATCGA CCATTAAATA CTCATCATCA TACTTAATAA CTCTGTATTT CGGATTTTTA 300 

TTGATAATTT CGGTTTCACA AAGCAATAAT TATCACTTCC TATTAATAAC AAATTCACAC 360 
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TTATATGACC TTAAATATAT AACATGAATC TTTTTGTCTA TTATTGAAGA CATATTTATA 
AAGAAAAATA GCATTGTCAT AATAACCCAA GCAATAAATA CTATAATATT TTGGATAGAT 
AAACTAATCA rrACATCTAA GAACATGATT gATAATCCAC CACAGAAAAA ATAAGAAAAT 
AGTACAAAGC AAAGATTCTT GAATGATGGA AAAATCATAA TTTTTCCATT GCTACTCCGA 
TCATTATAGA TAGATAACTT TACTTTCTGA TTTAAATATA TATAAAACAC TAGAATACTT 
AATAATAAAA CCGAACAAAT GATAATAACG CAATTTTTTT CTAAATGAGA ATCAGGTATA 
TATATTTTAT CTCTAAACAT AGTGCCAAAT AAAAGTATGC TACCTATAGC TGGCCATAAA 
GCTTTaTTTT TAACTGGTTT GACAATATTT AAATTATCAA AATCTTCTCT GCTGATTTGG
ACATATTTTT TTGGTATTAA CCAATTAATA AACGGAAAGA ACAAAACTAA CCAGGTGCTT 
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20 






ACTAAATCAA TCATCAGATA GTCGTTTTTA TATTTAATAA TTCTATATCT GGGATTTTTG 
TTTACAACTC TAACCTCGCA AAGCAATATC TCCACTTCCG TCTCGTTGGT TTTATATCTA 
ATACACTTTC AGATACTTTA TAAGTGTTTT GTATTTTAGT AACATACTAT TTTCCTGTTT 
ATTACTTAAC TTACGAACTA CAATCTAAGT TTAGTAATTT CTATTGCTTT TTAAGTTTGG 
CATAAACCTT TTTATTACTA ATTGAGCCCA TGCTTATTAG AAAGAAAAAA ATTGTAATAA 
TAATCCACAT AATAAATACC AGTAGATTTT GAGGTTTTAT AGTCATTAGC CATATTAAAA 
ATAATATAGA ACAACCTCCT AATAATAGAT ATGTGAAAAC TATAAAACTT CCATCTTTAA 
AAGTAGGCAC TAATATAACC CTATTTTCAT TATCTAGATT ATCATCATAT ATCTTTAGTT 
TAAGCTTTTT ATTTAAGTAA ATGTAAAATG CTGCAATACC TATAAATCCT ATAAAACATA 
AAGATATTAA AATCTTATTA TCTAATTGAA CTTCAAACGT ATGTACATAT TTCCGTAAAA
TAACTACAAA TAAAAACGAA CTACCAGTAA CTGGCCAGAA AATATTATTT TTATTTTGTT 
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TATCAACATT TAAATTTTCA AGTTCCTTCT CACTAAGTTT TGCATACCTT TTGGGAATGA 
ACCAATTAAT AAAAGGAAAA AAGTATACAA GCCAAGTGCT TACTAAATCA ATTAACAAAT 
ACTCATCATT ATATTGAACG ACTTTATATC TCGGATTTTT ATTAATAACC TTAATATTAA 
AAAGCAAAAC TCACCACGCC CATTTCATTG GATTTATATG ATTGCTAATA ATATTTTTAG 
CTTCACTAAC AGCATTCCCA ACACTATCCA TGGATTTTTC TGTAGTT7TT TTAACAACAT 
CTATACTATT ATCGATTTTA TGCCCTACCC AGTCTACTTT ATCTTTTAAT CCAAAAATAT 
TATTTTGATA AATTAAATCT GTTCCTAATG CAAATACTGT ACTCATAGCC AAACCTGCTA
AAATCACCCA TCCTACTGGA TTACTTCCTA AAACAAAAGT CGCTAATCCA GCTCCAACTG 
CTGTCCCTGC AGATCCAGCT GCAAGCGTgC ATACCATTAT GCGACAACGC CTCTCCAAAT 
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CCTTTACCTA GGTATTTTCC GCCTTTTGCA AATTTACTAC CATTTTCTAT AAACACATTA 22 80 

CCTGATGTAC GTTTGACTTC CACAAATGAA TTTGGACCTG CTGGGCCTTT CACTCCACCT 2340

GCTGTATTGa TAAATACACC GAATTTACTT GcATTTATAC CGTCTTGCTC TAAAAGTGTT 2400

GACGTAATAT CTAATCCTAT ATCTCTTTTA ATACTGTCTT TATTGTCATT TATATATTTC 2460 

AATATACTTT TCGGGATATC GTCTTCTGGA TGTTCTTTGG CATATGCCTT TATAACAGCA 2520 

AAGTCTGCTT TATTTAAAGT TTCTTTCTCT GCTTTATGTT CAATTTTCCC CATAGCAACT 2580

TTCAAATATT TT7CATGACT TGCTTTGGCC CAATCAAGTT CTTTACCTGA AGGAATATTA 2640 

AATTGATTTG TTGAAAAGTT CCAAAAATTC TGCGCTTGGG TAAGTCCTTG TTGGACAATT 2700 


TTTTGAAATT CTTCAACTTC TTTAAATATT TCTGGTGATT TTTGATTAAA CTCACGCAAT 2760 

TTGCGTAGCT TCTCTTCTAA TTCATGTTTT TGTTGACCTA ATGTTCGTAT TATTTGTTGG 2820 

TTCGATGAAA TGGCTTGCTG ATTATCGGAA GCATGCTTT7 TCAAATTGTT ATTCAAATTT 2880

TCATATCGCG TAATTTGTTG ACTTAATGAT CTGATATCTT CTTCAAGCTC TGATTCTTTT 2940

AAAGATATGC TATCAACCTC ACTCGTATAA CGTGACACAA AATTaTCGCA AGCTTGCTTC 3000 


GTTAAATCAC TCAATGTTTT CATACTTGTT GATAATGGAA TTAACACCGT ACTAAAAAAT 3060 

TGCTTAGCTG ACGTATACGC TTTCCCTTTA AGCGCATCAT CATTAATAAA TTGAGTAATT 3120

GCTTTTTCCA ACGCATCATA ATTTGAATTC ATTGTTTGAC TCAAATTCCC CACACTTGAA 3180 


GCTTGGTTTC GAGATCTGTC TAAATACATG TCAATACTCA TCGGCATGCT CCTTTTTCAA 3240 

AAATATATGA TTTTCAAACT ATTTAAAATC AAATGCTTTT TACATCTACA AAGTTGTAAA 3300

ATTTTAAAAC TCGGCGATGA 7TATTTCTTA TGTAAAGGAG TCTAGATGCA GGTAAATTGA 3360

GATAACATGT CGCCTTTTTT CTTATTTTAG CATATGGATA TAATGGTGTC TTTGTATATT 3420

CGCAATTAAT CAATAAAAAT 7ATCTTTCAA TATTTTAATT TTATTGCGAC AACATCCTTA 3480

ACATTAAATA TATTAATATC TCAAAATATA TTCACTATTA AAATATGTCA TCAGTTGTTA 3540

AAAGTATTTC CTCATCATGC GAAATATCAA AACGTATCTA AAATACGAAT AAGTTTATAC 3600 

AATCACACAA CATCATCATT CAAAATTTTA TTG 3633 


(2) INFORMATION FOR SEQ ID NO: 95: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2365 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
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TGATACGAAt GCATTACAAT TCATATGCAA CATACAATTC CTTCTACAGC AAATGAAGTG 60 

AAACAAATAG TTGATGTGAC ATCTGTAGCA GAAAATGATA CGCATTAGTC ATAAAATTAA 120 

ATGGAAATGT CGATGAAGTG TATCAGCAAT TACAGCGATT AATTAAGAAT GCTAATGTCG 180

AAGAGAGTGA GAATACTGAC AATATTAATA GTCAAGATAC AAGTTATACA CCTCAAGTAA 240 

AAGTAACAAC ACCAATTTTA GTGAAAGCAC CAATCGCTGG TCGTCGTATT TTACTTAAAG 300 

AAGTAAGAGA TTCAATTTTT AGAGAGAAAA TGGTAGGTGA AGGCTTAGCA ATCAAAGCTC 360 

ATGAAGAATC CAAAGTAATC GCACCGTTCA ATGGTTTAAT ATCTATGATT GTACCAACTA 420 

AGCATGCAGT TGGTATTCAA TCAGAAGACG GTGTGGACAT AGTCATTCAT ATTGGCGTGA 480 

ATACAGTTGA CTTGGAAGGT AAAGGGTTCA AGTGCTTTGT AAAGCAAAAT GATCATGTTG 540 

AAGCAGGGCA AACGTTGTTG CAATTCGACC AGCAATATAT ACAACAACAA GGCTACAATG 600 

CTGACGTTAT TGTCGTTATT AGCAACTCTG CCGATTTAGG AAAAGTAGAA CTGACAATGA 660

ATGAAATCAT TACGACTGAA GATGTTATTT TTAAAATATT TAAAAACTAG GAGTGTGTTG 720 

TAATAATGAC AAAATTACCG CAAAATTTCA TGTGGGGTGG CGCTCTTGCC GCAAATCAAT 780 

TTGAAGGTGG ATATGATAAA GGTGGTAAAG GGTTAAGTGT AATTGATGTT ATGACGAGTG 840

GTGCACATGG CAAAGCACGT CAGATTACAG AATCTATAGA TCCCAATCAC TATTATCCAA 900 

ATCATGAAGG TATTGATTTT TATCATCGTT ATAAGGAAGA TATTGCCTTG TTTAAAGAAA 960 

TGGGATTGAA ATGTTTACGT ACGTCGATTG CGTGGACACG TATCTTTCCG AATGGGGATG 1020 

AAGATGTGCC AAATGAAGAA GGACTCGCCT TTTATGATCG TATCTTTGAT GAATTAATTG 1080 

CACAAGGTAT TGAACCTGTT GTGACGTTAT CACATTTTGA GATGCCACTT CATTTAGCGA 1140 

AACATTATGG TGGATTTAGA AATAGAGAAG TTGTCGATTA TTTTGTGCAT TTTGCGCGTG 1200 

TTGTATTTGA AAGATATAAA GATAAAGTTA CATATTGGAT GACGTTTAAT GAAATTAATA 1260 

ATCAGATGGA CACATCAAAT CCTATCTTTT TATGGACGAA TTCTGGGGTA GCATTGACAG 1320

AAAATGATAA TCCTGAAGAA GTCyTGTATC AAGTAGCACA TCATGAACTT TTAGCCAGTG 1380

CyTTAGCAGT TCGTCTTGGT AAAGaGATtA ATCCgAaGTT TAAGATTGGr ACmATGATTt 1440

CAsnaTGTACC CmTTTATCCa TAwTCGTGTC ATCCGAAAGA TATGATGGAA GCACAAATTG 1500 

CGAATCGCTT ACGTTTCTTT TTCCCGGATG TCCAAGTGAG AGGTTATTAT CCAAGCTATG 1560 

CTAAAAAAAT GTTGGCACGA AAAGGATATG ATGTTGGATG GCAAGAAGGG GACGACAGTA 1620 

TTTTACAGCA GGGCACGGTT GATTATATTG GCTTTAGTTA TTACATGTCT ACGGCTGTAA 1680 

AACATGATGT TGATACTACA GTTGAAAACA ACATCGTCAA CGGTGGTTTG AATCATTCTG 1740 
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GATATACATT GAATGTGTTA TATGATCGTT ATCAGTTACC ACTTTTTATT GTGGAAAATG 
GTTTTGGTGC AGTTGATGAA GTGGTAGATG GACATATTCa TGATGATTAT CGCATTGAAT 
ATTTAAAAGC ACATATTACA GCAGCGATAG AAGCAGTTGA TCAAGATGGT GTAGATTTAA 
TCGGTTATAC ACCGTGGGGA ATCATTGATA TTGTTTCATT TACAACCGGT GAAATGAAGA 
AACGCTATGG TTTAATATAT GTTGATCGAG ATAATGATGG TCATGGCACG ATGGAACGCT 
TGAAAAAAGA TTCGTTCTAT TGGTATCAAC AAGTGATAGC ATCAAATGGA GATAAATTAT 
AAAGGTATAT TATAAGTATT TTAGGGTTAG AGCCCGAGAC ATAAATTAAT ATAGTAGGAC 
CTACAGTGTT ATAATGGCGG gCCCCCAACA CAAAGAATTT CGAAAAGAAA TTCtAcAGGT 
aATGCaAGtT GGCGGGGcCC AACACAGAGA AATTCGAAAA GAAATTCTAc AGGTAATGCA 
AGTTGGGGAA GGACAGAAAT AAATT 
(2) INFORMATION FOR SEQ ID NO: 96: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11050 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96:



CTGCGATACG 


ATTTGTTGAA 


AGTGGGGAAA ACAAAAAAGT 


TATCATTACC 


AATTTAGAGC 


AGGCATACGA 


AGCTTTGATT 


GGTAATAAAG 


GTACACACAT 


TCACATGTAG 


CACTTTATCA 


CGCGACAAAA 


CATTAAATAT 


GTTTCTCCGT 


TGATTCAAAT 


GAAAAAGTTG 


TCTGCTGACA 


CTTTGCAAGG 


TTTGAAGGAG 


TTTAACTTAT 


GACAGAAAAC 


TTTATTTTGG 


GTAGAAATAA 


TAAATTAGAA 



CATGAACTAA 


AGGCATTAGC 


AGATTACATT 


AATATACCAT 


ATAGTATATT 


ACAACCATAT 


CAAAGTGAAT 


GTTTTGTCAG 


ACATTATACG 


AAAGGCCAAG 


TTATTTATTT 


TTCGCCACAA 


GAAAGTAGCA 


ATATTTACTT 


TTTAATTGAA 


GGTAACATTA 


TTAGAGAACA 


TTACAATCAA 


AATGGAGATG 


TATATCGTTA 


•iTri'AATAAA 


GAGCAAGTAT 


TATTTCCAAT 


CAGTAACTTA 


TTTCATCCGA 


AAGAGGTTAA 


CGAATTGTGT 


ACAGCATTAA 


CCGATTGTAC 


AGTTdTGGA 


TTGCCTAGAG 


AATTGATGGC 


CTTTTTGTGC 


AAAGCTAATG 


ATGATATATT 


TTTGACACTT 


TTTGCATTAA 


TAAATGATAA 


TGAGCAGCAA 


CACATGAACT ATAACATGGC 


ATTAACAAGT 


AAATTTGCTA 


AAGATCGAAT 


TATCAAATTG 


ATATGCCATC 


TATGTCAGAC 


AGTAGGATAC 


GATCAAGATG 


AATTTTATGA 


AATCAAACAG 


rmTAACTA 


TTCAACtCAT 
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TGAAAAACTT GTTGTTAAAG ATCATAAAAA TTGGTTAGTA AGCAAACATT TATTCAATGA 
TGTATGTGTT TAATATACAA TGTAAAATGA ATAAGTTGAA CATGAGGTCT AACGTACATT 
TATACGTTAG GCCTTTTTTG CTAGCATGAT GAATAATTTA AAATGTTAGT TAAATTTGAT 
TGTTGAAATT ACAGTAAAAT TTAAGGTGAT GAAAAATTTA GAACTTCTAA GTTTTTGAAA 
AGTAAAAAAT TTGTAATAGT GTAAAAATAG TATATTGATT TTTGCTAGTT AACAGAaAAT 
TTTAAGTTAT ATAAATAGGA AGAAAACAAA TTTTACGTAA TTTTTTTCGA AAAGCAATTG 
ATATAATTCT TATTTCATTA TACAATTTAG ACTAATCTAG AAATTGAAAT GGAGTAATAT 
TTTTGAAAAA AAGAATTGAT TATTTGTCGA ATAAGCAGAA TAAGTATTCG ATTAGACGTT 
TTACAGTAGG TACCACATCA GTAATAGTAG GGGCAACTAT ACTATTTGGG ATAGGCAATC 
ATCAAGCACA AGCTTCAGAA CAATCGAACG ATACAACGCA ATCTTCGAAA AATAATGCAA 
GTGCAGATTC CGAAAAAAAC AATATGATAG AAACACCTCA ATTAAATACA ACGGCTAATG 
ATACATCTGA TATTAGTGCA AACACAAACA GTGCGAATGT AGATAGCACA ACAAAACCAA 
TGTCTACACA AACGAGCAAT ACCACTACAA CAGAGCCAGC TTCAACAAAT GAAACACCTC 
AACCGACGGC AATTAAAAAT CAAGCAACTG CTGCAAAAAT GCAAGATCAA ACTGTTCCTC 
AAGAAGCAAA TTCTCAAGTA GATAATAAAA CAACGAATGA TGCTAATAGC ATAGCAACAA 
ACAGTGAGCT TAAAAATTCT CAAACATTAG ATTTACCACA ATCATCACCA CAAACGATTT 
CCAATGCGCA AGGAACTAGT AAACCAAGTG TTAGAACGAG AGCTGTACGT AGTTTAGCTG
TTGCTGAACC GGTAGTAAAT GCTGCTGATG CTAAAGGTAC AAATGTAAAT GATAAAGTTA 
CGGCAAGTAA TTTCAAGTTA GAAAAGACTA CATTTGACCC TAATCAAAGT GGTAACACAT 
TTATGGCGGC AAATTTTACA GTGACAGATA AAGTGAAATC AGGGGATTAT TTTACAGCGA 
aGTTACCAGA TAGTTTAACT GGTAATGGAG ACGTGGATTA TTCTAATTCA AATAATACGA 
TGCCAATTGC AGACATTAAA AGTACGAATG GCGATGTTGT AGCTAAAGCA ACATATGATA 
TCTTGACTAA GACGTATACA TTTGTCTTTA CAGATTATGT AAATAATAAA GAAAATATTA 
ACGGACAATT TTCATTACCT TTATTTACAG ACCGAGCAAA GGCACCTAAA TCAGGAACAT 
ATGATGCGAA TATTAATATT GCGGATGAAA TGTTTAATAA TAAAATTACT TATAACTATA 
GTTCGCCAAT TGCAGGAATT GATAAACCAA ATGGCGCGAA CATTTCTTCT CAAATTATTG 
GTGTAGATAC AGCTTCAGGT CAAAACACAT ACAAGCAAAC AGTATTTGTT AACCCTAAGC 
AACGAGTTTT AGGTAATACG TGGGTGTATA TTAAAGGCTA CCAAGATAAA ATCGAAGAAA 


GTAGCGGTAA AGTAAGTGCT ACAGATACAA AACTGAGAAT TTTTGAAGTG AATGATACAT 
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ACCAATTTAA AAATAGAATC TATTATGAGC ATCCAAATGT AGCTAGTATT AAATTTGGTG 
ATATTACTAA AACATATGTA GTATTAGTAG AAGGGCATTA CGACAATACA GGTAAGAACT 
TAAAAACTCA GGTTATTCAA GAAAATGTTG ATCCTGTAAC AAATAGAGAC TACAGTATTT 
TCGGTTGGAA TAATGAGAAT GTTGTACGTT ATGGTGGTGG AAGTGCTGAT GGTGATTCAG 
CAGTAAATCC GAAAGACCCA ACTCCAGGGC CGCCGGTTGA CCCAGAACCA AGTCCAGACC 
CAGAACCAGA ACCAACGCCA GATCCAGAAC CAAGTCCAGA CCCAGAACCG GAACCAAGCC 
CAGACCCGGA TCCGGATTCG GATTCAGACA GTGACTCAGG CTCAGACAGC GACTCAGGTT 
CAGATAGCGA CTCAGAATCA GATAGCGATT CGGATTCAGA CAGTGATTCA GATTCAGACA 
GCGACTCAGA ATCAGATAGC GACTCAGAAT CAGATAGTGA GTCAGATTCA GACAGTGACT 
CGGACTCAGA CAGTGATTCA GACTCAGATA GCGATTCAGA CTCAGATAGC GATTCAGACT 
CAGACAGCGA TTCAGATTCA GACAGCGACT CAGATTCAGA CAGCGACTCA GACTCAGATA 
GCGACTCAGA CTCAGACAGC GACTCAGATT CAGATAGCGA TTCAGACTCA GACAGCGACT 
CAGACTCAGA CAGCGACTCA GACTCAGATA GCGACTCAGA TTCAGATAGC GATTCAGACT 
CAGACAGCGA CTCAGATTCA GATAGCGATT CGGACTCAGA CAGCGATTCA GATTCAGACA 
GCGACTCAGA CTCGGATAGC GATTCAGATT CAGATAGCGA TTCGGATTCA GACAGTGATT 
CAGATTCAGA CAGCGACTCA GACTCGGATA GCGACTCAGA CTCAGACAGC GATTCAGACT
CAGATAGCGA CTCAGACTCG GATAGCGACT CGGATTCAGA TAGCGACTCA GACTCAGATA 
GTGACTCCGA TTCAAGAGTT ACACCACCAA ATAATGAACA GAAAGCACCA TCAAATCCTA 
AAGGTGAAGT AAACCATTCT AATAAGGTAT CAAAACAACA CAAAACTGAT GCTTTACCAG 
AAACAGGAGA TAAGAGCGAA AACACAAATG CAACTTTATT TGGTGCAATG ATGGCATTAT 
TAGGATCATT ACTATTGTTT AGAAAACGCA AGCAAGATCA TAAAGAAAAA GCGTAAATAC 
TTTTTTAGGC CGAATACATT TGTATTCGGT TTTTTTGTTG AAAATGATTT TAAAGTGAAT 
TGATTAAGCG TAAAATGTTG ATAAAGTAGA ATTAGAAAGG GGTCATGACG TATGGCTTAT 
ATTTCATTAA ACTATCATTC ACCAACAATT GGTATGCATC AAAATTTGAC AGTCATTTTA 
CCGGAAGATC AAAGCTTCTT TAATAGCGAT ACAACTGTTA AACCATTAAA AACTTTAATG 
TTGTTACATG GATTATCAAG TGATGAAACG ACATATATGA GATATACAAG CATAGAAAGG 
TATGCGAATG AACACAAATT AGCTGTGATT ATGCCCAATG TGGATCATAG CGCATATGCT 
AACATGGCAT ATGGTCATAG CTATTATGAT TATATTTTGG AAGTGTATGA TTATGTTCAT 
CAAATATTTC CACTTTCCAA AAAGCGTGAT GACAATTTTA TAGCAGGTCA CTCTATGGGA 






TTATCTGCTG TGTTTGAAGC GCAAAATTTA 
GAGGCCATAA TTGGCAATCT TTCAAGTGTT 
CTAGACAAAG CTGTAGCTGA AGATAAACAA

CAAGACTTTT TATATCAAGA CAACTTAGAT 
CCTTATCAAT TTGAAGATGG ACCAGGAGAT 


AAGCGTGCTA TAACATGGAT GGTGAATGAT 
TTAAATACAC AGAGTGAGAG ATACAAACTA 
AAATTATTTT TGTATTAATA TGATTGGCGC 


GAAACTTAGA TTTAGCTTAT AGTTTTATCA 
TTATAATGAG GTTAACGCTT TGAAAGGAGT 

TGAGCATATG TTGTTTTATT TTGCATATAA

AGAGAAGTAT GGTATGAGTC GTCAGCATCA 
TGGTATTACT ATTAAATCAT TACTAGAAAT 

AACACTTCAA AAATTAAAAG AGCAAGGTCT

ACGTGTCAAA AAATTATATT CGACGGATAA 
GGCGCAAGAT GAATTATTGC AAAATATATA 


GATGGAAGCA TTGGCTAAAG GgCGACCTGG 
AAAAGAAAGC TAGCATCAGA AATGTTAAAA 
TCAAAAAGTG TATAATAAAA ACATATAATT 


AAGGAGTTTG AATGATGAAA AAATTAGCAG 
TCGCATTTAA AAAATACCAA GAACGTGTTA 

AACCATAAAA AATTCCCGAA CACCTTGTTA
TGAATATATC AAATATTATT TTTGCGCTTT 
CTGATCTAGG TCCGTAAGCG TAgGTATTAA 

ACCCTGTATA AGATTTATCA TTTACTGGCT
GCGTTTCTAC TTCTGCGGAT TTTTCGTCTT 

TATTCTCTTT TTTAACCTTT TTCATATCAT 


TAATAACTTT TTCAGGGTCT TCACCTTTAG 

GTTTCATAGA TTTAATCGGT TGAGGATTCC 
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ATGGATCTAG AGTGGAATGA TTTTTCAAAA 4 500 

AAAGGAACTG AACATGATCC GTATTACTTG 4560

ATTCCAAAAT TGCTCATTAT GTGTGGTAAA 4620 

TTTATCGATT ATTTATCACG CATAAATGTT 4680 

CATGATTATG CATATTGGGA TCAAGCGATT 4740 

TAATTATTTC TTGGAAAATA TGTGGCTGCA 4800 

TTTACGCACG ACTAACATTT CTAAGTGTTT 4860 

AATTTGCTGA TACACAAAAA TGTTTCTCGT 4920 

TCATTTGTAT GACTTACATT ATAAATTTTA 4980

CATCATCATG TCGACCAATA AAAACGATTA 5040 

AACCTTTATT ACTACCGCTG ATGAAATTAT 5100 

TCGTTTTTTG TTTTTTATCA ATAAATTACC 5160 

ATTAGAAATT TCTAAmCAAG GATCACATGC 5220 

CATTATTGAA AAAGTTTTAG AGACTGATCG 5280 

AGGCGATCAA CTCATTGCTG AATTGAACAA 5340 

TCAACAAGTC GGTTCGGATT GGTATGATGT 5400 

cTTTGATTTT ATTAAGCATT TGAAAGATGA 5460 

ATCTTCGCAT TCTTAAATTT AAAAAATATG 5520 

TAATTGAACT CAGTTTCAAC ACATCTTAGA 5580 

TTATTTTAAC ATTAGTTGGC GGTTTATACT 5640 

ACCAAGCACC TAACATTGAG TACTAAATTA 5700 

TAGTGCTCGG GAATTTTTTT ATGCTTTACT 5760 

CTGTATTTTC GATATTACCA CTAAATGATT 5820 

CATCCTCGCC TGTATGTCCA TCGGAAGTCC 5880 

TCTGAATAGC GTGTTGTAGG GCTTTTGTTT 5940 

TTTCTTTTTT AAGTAGTCTT TTTAGCTTTT 6000

CTTGTGAAAA TTCAAATCCA TAACCTTCAT 6060 

CCATTTTTTC TGTCATATAT GATCCAGAGT 6120 

ATTCGTATCC TTTATCTTTA CCAATTGTTA 6180 
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ATTGAATGGC GTCATCGAAT GCTTTTTCAA AACCTTCCAT TTCAGACATA ACGCCTGTAA 
TATCGTTGGA ATGCGCTGAT TTATCTATAG AAGCACCTTC GACCATTAAA AAGAATCCTT 
TTTTATTGCG CTCAAGCTTA CTAAGTGCAC TTTGTTGCAT ATCAGCTAAT GATGGTTCGT 
CTTTAGAAGC ATCTATTGCA AGTGGCATAT TTTTATCTGC AAACAAACCA AGAACTTTAT 
CTTTATCAGA TTTTGATAAC TCCTTACTGT TCGTGGCAAG GTCGTAACCA TCTTTTTTGA 
ATTTTTTATC TAAATTGCCA TTACTTTTAC CGAAATATTT AGCGCCGCCG CCTAATAAAA 
CATCAACTTT ATGCTTTCCG TTGATTTTAT CTTTATAAAA TTGTTTAGCG ATTTCGTTTT 
TATCATCTCT AGAAGTCACG TGTGCAGCAT ATGCTGCTGG TGTTGCATCT GTTAATTCAG 
CTGTTGAAAC AAGACCAGTC GACTTACCTT TTTCTTTTGC ACGTTCAAGC ACCGTCTTTA 
CTTTCTGCTT GTTACTGTCA ACACCGATGG CACCATTATA TGTCTTATGA CCAGAACTAA 
AGGCTGTTCC GCCAGCTGCA GAATCAGTAA TATTCTGTTT TGGGTCATTT GAATATGTAC

GATTTGTGCC TTTTAAATAT GAATCAAAAG CAGTAGGGGT CATTTCTTTA GCATGCGGAT 
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CATTTTTATA ATAACGATAA GCTGTGTTAA ATGATGGACC CATGCCATCG CCAACTAAAA 
AGATAACATT TTTTGGATTT TTAGTATTAC CAACCGCGAA ACTTTCATCT TTAGAACTTT 
TATCGGATTG CGCAATTGCA GGTGTGACAG AACTAAAAAC CGTTGACACG ATAATAAGGT 
TAGCAACTGC AAATTTTGTG GCTTTTTTAA CTGATAACAT AAGACATCCT CCTGAGTATA 
TGACTATGTC TTCAGTGTAA AAGAGGAATT TtGAGCAATT ATGTAGTTTT AGTTAnAAAT 
ATGTAAACAG AGTGATTTAG AATAACAAAA aATGAATATA TATGACAATT TGTTATAGAA 
AGCGTTAGAA TAGAAGCGTG TGAAAATATA GAATTAAATA TAATTTGAGG TGGAAAAATG 
ATACTAGTAA TGTTATCTCC ATTATTAATC ATATTCTTTA TAGTGTTGTC TATTTTAGAA 
GAGCGTAAAC GTACGAAGAA AAAGCAACTC GAGAAAGAAA AAGCAAATAC ACTAAATCAA 
AATACAAATG ACACGGAAAG TTCAAATCAA GAGCCGTCAT TGCAGCAGGA TAAAGAACAA 
AAAGATAACA AAGGATAATT CAATTGAAGG AAGAAGATTA TAGATGAAAA TATTAATTGT 
TGAAGATGAT TTTGTTATAG CAGAGAGTTT AGCATCTGAA CTTAAAAAAT GGAATTACGG 
TGTTATTGTC GTTGAACAAT TTGATGATAT ACTGTCTATC TTTAACCAAA ATCAACCTCA 
GCTTGTATTG CTAGATATTA ATTTGCCAAC GTTAAATGGT TTTCATTGGT GTCAAGAAAT 
CCGAAAAACA TCTAATGTGC CAATTATATT TATTAGTTCC CGTATTGATA ATATGGACCA 
AATTATGGCA ATACAAATGG GGGGAGATGA TTTTATCGAA AAGCCATTTA ACTTGTCATT 
AACGATTGCC AAAATTCAAG CATTATTGAG ACGAACTTAT GACTTGTCAG TAGCTAATGA 
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ACAAAACATA CAGCTA7CTT TGACTGAATT ACAAATATTA AAGTTATTAT TTCAAAATGA 
AGaTAAATAT GTAAGTAGrA CTGCTTTAAT TGaAAAATGT TGGGaATCAG AAAACtTCAT 
AGATGATAAC ACATTAGCTG TTAACATGAC GCGCCTGCTG AAAAAATTAA ATACTATTGG 
CGTTAATGAT TTTATCATTA CAAAGAAAAA TGTCGGATAT AAAGTATAGG GTGAATGCAA 
TGACCTTTCT TAAAAGTATT ACTCAGGAAA TAGCAATAGT CATAGTTATT TTTGCTTTGT
TTGGCTTAAT GTTTTACCTG TATCATTTGC CATTAGAAGC ATATTTACTA GCACTTGGCG 
TTATTTTATT ATTATTACTC ATATTCATAG GTATTAAATA TTTAAGTTTT GTAAAAACTA 
TAAGCCAACA ACAACAAATT GAAAACTTAG AAAATGCGTT GTATCAGCTT AAAAATGAAC 
AAATTGAATA TAAAAATGAT GTAGAGAGCT ACTTTTTAAC ATGGGTACAT CAAATGAAAA 
CACCCATTAC TGCAGCACAA CTGTTACTTG AAAGAGATGA GCCTAATGTT GTTAATCGTG 
TTCGTCAAGA GGTTATTCAA ATTGaTAACT ATACAAGTTT AGCACTTAGT TATTTAAAGT 
TATTAAATGA AACTTCTGaT ATTTCTGTCA CTAAAATTTC GATTAATAAT ATCATTCGCC 
CAATTATTAT GAAATATTCA ATACAGTTTA TTGATCAAAA AACAAAAATC CATTATGAAC 
CTTGTCATCA CGAAGTATTA ACTGACGTTA GATGGACCTC TTTAATGATA GAACAATTAA 
TAAATAATGC ACTTAAGTAT GCGAGAGGTA AAGATATATG GATTGAATTT GATGAGCAAT 
CCAATCAATT ACACGTAAAA GATAATGGTA TCGGTATTAG TGAAGCGrAC TTGCCTAAAA 
TATTTGATAA GGGCTATTCA GGTTATAATG GCCAGCGCCA AAGTAACTCA AGTGGGaTTG 
GTTTATTTAT CGTAAAACAA ATTTCAACAC ACACAAACCA TCCTGTTTCA GTCGTATCTA 
AACAAAATGA GGGTACAACA TTTACGATTC AATTTCCAGA TGAATAAAAA CTTTCAATAT 
TGTAAGTATA CTAGTAACAT TTTTTTACTA ATTTAAATGT TATTAGTATT TTTTTGTTTT 
AATATAGAAC TAACAAAGAA ATGAGGTGCA TGCCATGTTG CTAGAAGTGn AACATGTAAA 
AAAGGTTTAT GGTAAAGGTT TGAATGCTAC GACAGCACTT AATCAAATGA ATTTATCAGT 
TGGAGCTGGT GaATTTGTTG CaATTATGGG TGAGTCTGGG tCAGGGAAGT CTACACTACT 
AAATTTAATT GCtTCTTTTG ATGGACTAAC TGAAGGTGAC ATTATTGTGG ATGGCGCACA 
TTTAAATAAT ATGAAAAATA AAAGTAAAGC ATTGTATCGT CaACAAATGG TAGGTTTTGT 
TTTTcAAGAT TTTAATCTTT TACCAACAAT GACGAATAAA GAAAATATAA TGATGCCATT 
AATTTTAGCT GGTGCTAAAC GAAAAGATAT AGAACAAAGG GTACATCAGT TGGCAGTACA 
ATTACATTTA GAGGGATTCT TAAACAAGTA TCCTTCTGAA ATCTCTGGGG GTCAGAAGCA 
ACGCATTGCC ATTGCACGTG CATTAGTTAC TAAGCCGACG ATTTTACTAG CCGATGAACC 
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TCAATTGGAA CAGACAATTT TAATGGTAAC TCATTCAAAT ATCGATGCGT CTTATGCAGA 
GCGAGTCATT TTTATTAAAG ATGGGCGTCT ATATCATGAA ATATATCGTG GTGAAGAAAG 
TCAATTAGCT TTTCAACAAC GAATAACAGA TAGCTTAGCA CTTGTGAATG GAGGAAGTGT 
CAATATATGA AGTTAAGATT GTTATGnACA TAGTGCGACG TCAATTTATT ACGCAGCGAC 
TTGTAATCAT TCCATTCATT TTAGCGGTAA GTGTACTATT CATGATTGAA TATACGCTTG 
TGTCAATTGG GTTAAATAGC TACATAAAAC AGAAGAATGA CTTCCTAGTA CCATTTATTA 
TCATAGCTAA TTTTTTTATG GCGCTTTTAA CTTTTATTTT TATTTTCTAT GCAAATCACT 
TTATGATGTC ACAAAGACGA AAAGAGTTTA GCATTTTTAT GACATTGGGC ATGACCAAGA 
AAAGTATGCG TTTAATTGTA GTGATGGAAA CTATCTTACA ATTTGTGATA ATTTCAGTCG 
TTAGTATTGC CGGCGGATAC TTACTTGGTG CGATATTTTT CTTGTTTATA CAGAAAATAA 
TGGGCAGTGA AGTTGCGACG TTAAGGTATT ATCCATTTGA CTCTGTAGCG ATGTTTATTA 
CTTTGATTAT CATTGCTGTA TTAATGGGCA TGCTACTTAT ATTCAACTTG TTTAGTATTA 
ATTTTCAACG GCCGATAACT TATCAACATC GTTCCGATTC TAGTGTCATA TCACGATGGT 
TGCGTTACGT TTTAATTGTT ATAGGAAGCG CAnACTATAT TTAGGTTACT TTATTGCATT 
ACAACAAGAT ACGACGTTTG GTGCCTTTTT TAAAATATGG ATTGTCATAG GATTAGTTAT 
TATCGGTACT TATGCATTTT TTGTAGGTAT AAGTGAAATA ATTATTAGTA TATTGCAGCA 
GGTATCAAAA GTTTACTATC ATCCACGGTA TTTTTTTGTG GTAGTTGGGA TGCGTGTACG 
TCTTAAAATG AATGCAGTCA GTCTTGCAAC AATCACTTTG CTGTGTACAT TTTTGATTGT 
AACGCTCACA ATGACATTAA CAACCTATCG TGATATGAAT CATACCATTA CGAAATTGAT 
TACGAATGAT TakGATTTGT CATTTAGCGA CAATTCTAAG TCACAAaTAG AACGTCAACA 
AACAATTGAG 

(2) INFORMATION FOR SEQ ID NO: 97:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 983 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear 
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(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97: 
CGACATAACG AGGCAAGGGT ACATGATACT TTAGCCTCGT TTTTGATATG TATTTTTCTG 
AATATAAGGG CAATAGATGG TATTTTATAw TTTTTTTAAG GTAGTGATTA ACATAGATAT 
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TCAAGCGGAA CAGCATTATG CACCAGTATT AACGCATTTT TTAGATCCAA GAGGGCAATA 24 0 

TATATTGGAA GTGATTTGTG GCAGTTATGA AGATTTAAAC GTATCTTTTT ATGGTGGACC 300

TAATGCTGAA AGAAAAAGAG CAATCATTTC GCCGAACTAT TATGAACCTA AAGAAAGCGA 360 

CTTTGAATTA ACTTTAATGG AAATAGATTA TCCTGAAAAA TTCGTCACTT TAAAACATCA 420 

ACATATTTTA GGGACATTAA TGTCTTTAGG TATCGAACGC GAACAAGTTG GAGATATAAT 480

TGTGaATGAA CGAATTCAAT TTGTTTTGAC AAGTAGATTG GAATCATTTA TTATGTTAGA 540 

ATTACAACGT ATTAAAGGCG CATCAGTTAA ACTTTATACT ATTCCAGTAA CAGATATGAT 600 

ACAATCTAAT GAGAATTGGA AAAATGAAAG TGCaCAGTTA GTTCTTTAAG GTTAGATGTT 660 

GTTATTAAAG AAATGATACG TAAATCACGT ACGATTGCGA AACAACTAAT CGAAAAAAAA 720 

CGTGTTAAAG TGAATCACAC TATTGTTGAT TCAGCAGATT TTCAATTACA AGCAAATGAT 780

TTAATATCCA TCCAAGGTTT TGGTAGAGCA CACATTACTG ACTTAGGTGG TAAAACTAAA 840

AAAGATAAAA CGCACATTAC CTATAGAACA TTATTCAAAT AGTAATGATT TAAGGAGGAT 900 

AACAAATGCC TTTTACACCA AATGAaATTA AGAATAAAGA GTTTTCACGT GTaAAGAATG 960 

GTTTTAGAAC CTACTGnAGT TGG 983
(2) INFORMATION FOR SEQ ID NO: 98: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 10322 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98:
TTTTGCAAAG CTTATTTTAT GTCAAACAGA TAGTCAATGT GAAACAAAGG TTAGTACATA 60 

TAATCATCCA GACTTTATGT ATATATCAAC AACTGAGAAT GCAATTAAGA AAGAACAAGT 120

TGAACAACTT GTGCGTCATA TGAATCAACT TCCTATAGAA AGCACAAATA AAGTGTACAT 180 

CATTGAAGAC TTTGAAAAGT TAACTGTTCA AGGGGAAAAC AGTATCTTGA AATTTCTTGA 240 

AGAACCACCG GACAATACGA TTGCTATTTT ATTGTCTACA AAACCTGAGC AAATTTTAGA 300 

CACAATCCAT TCAAGGTGTC AGCATGTATA TTTCAAGCCT ATTGATAAAG AAAAGTTTAT 360 

AAATAGATTA GTTGAACAAA ACATGTCTAA GCCAGTAGCT GAAATGATTA GTACTTATAC 420

TACGCAAATA GATAATGCAA TGGCTTTAAA TGAAGAATTT GATTTATTAG CATTAAGGAA 480

ATCAGTTATA CGTTGGTGTG AATTGTTGCT TACTAATAAG CCAATGGCAC TTATAGGTAT 540


GAATGGTTTC TTCGAAGATA TCATACATAC AAAGGTAAAT GTAGAGGATA AACAAATATA 660

TAGTGATTTA AAAAATGATA TTGATCAATA TGCGCAAAAG TTGTCGTTTA ATCAATTAAT 720 

TTTGATGTTT GATCAACTGA CGCAAGCACA TAAGAAATTG AmTCAAAATG TAAATCCAAC 780

GCTTGTATTT GAACAAATCG TAATTAAGGG TGTGAGTTAG ATGCCAAATG TAATAGGTGT 840 

TCAGTTTCAA AAAGCGGGAA AATTAGAATA TTATACACCT AATGATATAC AAGTAGATAT 900 


AGAAGACTGG GTAGTTGTCG AATCTAAAAG AGGCATAGAG ATAGGTATTG TTAAAAATCC 960 

ATTAATGGAT ATTGCTGAAG AGGATGTTGT GTTACCTCTT AAAAATATTA TTCGCATTGC 1020 

TGATGACAAA GATATTGATA AATTTAATTG TAATGAACGA GATGCTGAAA ATGCATTAAT 1080

ACTATGTAAA GACATTGTAA GAGAACAAGG TTTGGACATG CGTTTAGTCA ATTGCGAATA 1140

TACATTAGAT AAATCGAAAG TTATTTTTAA TTTTACGGCG GATGATCGTA TTGATTTTAG 1200 

AAAATTAGTA AAAATATTAG CGCAACATTT AAAAACACGT ATCGAGTTGA GACAAATTGG 1260

TGTAAGGGAT GAAGCCAAAT TGCTTGGCGG TATCGGACCT TGTGGTAGGT CGTTATGTTG 1320 

TTCTACATTT TTAGGGGATT TTGAACCAGT ATCGATTAAG ATGGCTAAGG ATCAAAATTT 1380

ATCATTAAAT CCAACTAAAA TTTCTGGTGC ATGTGGTCGT TTGATGTGTT GTTTAAAATA 1440

TGAAAATGAC TATTATGAGG AAGTACGTGC ACAATTACCT GATATTGGTG AAGCAATTGA 1500 

AACGCCTGAT GGTAACGGGA AAGTAGTTGC TTTAAATATA TTAGACATTT CTATGCAGGT 1560 


GAAGCTTGAG GGACATGAAC AGCCACTTGA ATATAAATTA GAAGAAATAG AAACTATGCA 1620 

TTAAGGAGGC ATTATTACAT TTGGATCGCA ATGAAATATT TGAAAAAATA ATGCGTTTAG 1680

AAATGAATGT CAATCAACTT TCAAAGGAAA CTTCAGAATT AAAGGCACTT GCAGTTGAAT 1740

TAGTAGAAGA AAATGTAGCG CTTCAACTTG AAAATGATAA TTTGAAAAAG GTGTTGGGCA 1800 

ATGATGAACC AACTACTATT GATACTGCGA ATTCAAAACC AGCAAAAGCT GTGAAAAAGC 1860 

CATTACCAAG TAAAGATAAT TTGGCTATAT TGTATGGAGA AGGATTTCAT ATTTGTAAAG 1920

GCGAATTATT TGGAAAACAT CGACATGGTG AAGATTGTCT GTTCTGTTTA GAAGTTTTAA 1980 

GTGATTAATC AAGCACACTC AAATAGTGTT ATAATTATAA ATGAATATGG TTTGGATAAG 2040 


TCTGAGACAA TGCATGTTTC AGGCTTTAAT TGTGTATAAA GTTTTGGTGA TTGCATAAGA 2100 

GATGGCGGTA CTAAATGTTA TTATTAAGTG TGCACGCAgT ATCaTTAGTT ATAAAATGTA 2160 

GCTGTTAAAA GTCAAAAATA CATCGAATGT AGTTAGGCAT ATAATATAAA AAGAGTTTTC 2220 


AATTACTCAA TAGAAAAAGG TTGTCTTCAT AGGAGTTAAA AATGTTAAAA GAGAATGAAC 2280 

GATTTGATCA ACTAATCAAA GAAGATTTTA GTATTATTCA AAATGATGAT GTTTTTTCAT 2340 
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TGGACTTATG TTCAGGCAAT GGGGTGATAC CCTTGTTATT GTTTGCGAAA CATCCACGAC 24 SO 

ATATAGAAGG TGTTGAGATT CAAAAAACAC TTGTCGATAT GGCGCGACGC ACATTTCAAT 2520 

TCAATGATGT TGATGAATAT TTAACAATGC ATCACATGGA TTTGAAAAAC GTTACTAAAG 2580 

TATTTAAACC TTCACAATAT ACTTTAGTAA CGTGTAATCC GCCTTATTTT AAAGAGAATC 2640 

AGCAACACCA ACATCAAAAA GAAGCACATA AGATAGCGAG ACATGAGATT ATGTGTACAC 2700 

TTGAAGATTG CATGATTGCA GCCCGTCATT TATTAAAAGA AGGTGGCAGG CTAAACATGG 2760 

TACATCGTGC AGAGAGACTA ATGGATGTCT TGTTTGAAAT GAGAAAAGTG AATATTGAAC 2820 

CTAAGAAAGT CGTTTTTATA TATAGTAAAG TAGGGAAATC AGCACAAACG ATAGTAGTAG 2880

AAGGTCGAAA AGGTGGAAAT CAAGGTTTAG AAATCATGCC CCCATTTTAT ATTTATAATG 2940 

AAGATGGTAA TTATAGCGAA GAAATGAAGG AAGTATATTA TGGATAGTCA TTTTGTATAT 3000 

ATTGTAAAAT GTAGTGATGG AAGTTTATAT ACAGGATACG CTAAAGACGT TAATGCACGT 3060

GTTGAAAAAC ATAACCGAGG TCAAGGAGCC AAATATACGA AAGTAAGACG TCCGGTGCAT 3120 

TTAGTTTATC AAGAAATGTA TGAGACAAAG TCTGAAGCAT TGAAGCGTGA ATATGAAATT 3180 

AAAACTTATA CCAGACAAAA GAAATTGCGA TTAATTAAGG AGCGATAGTA TGGCTGTATT 3240 

ATATTTAGTG GGCACACCAA TTGGTAATTT AGCAGATATT ACTTATAGAG CAGTTGATGT 3300

ATTGAAACGT GTTGATATGA TTGCTTGTGA AGACACTAGA GTAACTAGTA AACTGTGTAA 3360 

TCATTATGAT ATTCCAACTC CATTAAAGTC ATATCACGAA CATAACAAGG ATAAGCAGAC 3420

TGCTTTTATC ATTGAACAGT TAGAATTAGG TCTTGACGTT GCGCTCGTAT CTGATGCTGG 3480

ATTGCCCTTA ATTAGTGATC CTGGATACGA ATTAGTAGTG GCAGCCaGAG AAGCTAATAT 3540

TAAAGTAGAG ACTGTGCCTG GACCTAATGC TGGGCTGACG GCTTTGATGG CTAGTGGATT 3600 

ACCTTCATAT GTATATACAT TTTTAGGATT TTTGCCACGA AAAGAGAAAG AAAAAAGTGC 3660 

TGTATTAGAG CAACGTATGC ATGAAAATAG CACATTAATT ATATACGAAT CACCGCATCG 3720

TGTGACAGAT ACATTAAAAA CAATTGCAAA GATAGATGCA ACACGACAAG TATCACTAGG 3780 

GCGTGAATTA ACTAAGAAGT TCGAACAAAT TGTAACTGAT GATGTAACAC AATTACAAGC 3840 

ATTGATTCAG CAAGGCGATG TACCATTGAA AGGCGAATTC GTTATCTTAA TTGAAGGTGC 3900 

TAAAGCGAAC AATGAGATAT CGTGGTTTGA TGATTTATCT ATCAATGAGC ATGTTGATCA 3960 

TTATATTCAA ACTTCACAGA TGAAACCAAA ACAAGCTATT AAAAAAGTTG CTGAAGAACG 4020 

ACAACTTAAA ACGAATGAAG TATATAATAT TTATCATCAA ATAAGTTAAT CACTTTATCG 4080

ATTaTATGAA ATTTTAAACG ATTTTATAAA CGCAAGCTGT AATTTTAAAT GGTAAGTTAT 4140 
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GTTTTTTAAT GTAAAATAAA TACATTGAAA GTAATAAATA CCTTAACATT GAATAAGATG 4 260 

AAAATGAGAT GACGAGATAA ATGTTCGCGT CCGTTGAAAT GCATAGAAAT CTTAGATATT 4320

ATTTGAAGTG AGACATTACG AGGAGGAACA GTTATGGCTA AAGAAACATT TTATATAACA 4380 

ACCCCAATAT ACTATCCTAG TGGGAATTTA CATATAGGAC ATGCATATTC TACAGTGGCT 4440 

GGAGATGTTA TTGCAAGATA TAAGAGAATG CAAGGATATG ATGTTCGCTA TTTGACTGGA 4500

ACGGATGAAC ACGGTCAAAA AATTCAAGAA AAAGCTCAAA AAGCTGGTAA GACAGAAATT 4560 

GAATATTTGG ATGAGATGAT TGCTGGAATT AAACAATTGT GGGCTAAGCT TGAAATTTCA 4620 

AATGATGATT TTATCAGAAC AACTGAAGAA CGTCATAAAC ATGTCGTTGA GCAAGTGTTT 4680

GAACGTTTAT TAAAGCAAGG TGATATCTAT TTAGGTGAAT ATGAAGGTTG GTATTCTGTT 4740

CCGGATGAAA CATACTATAC AGAGTCACAA TTAGTAGACC CACAATACGA AAACGGTAAA 4800

ATTATTGGTG GCAAAAGTCC AGATTCTGGA CACGAAGTTG AACTAGTTAA AGAAGAAAGT 4860

TATTTCTTTA ATATTAGTAA ATATACAGAC CGTTTATTAG AGTTCTATGA CCAAAATCCA 4920 

GATTTTATAC AACCACCATC AAGAAAAAAT GAAATGATTA ACAACTTCAT TAAACCAGGA 4980

CTTGCTGATT TAGCTGTTTC TCGTACATCA TTTAACTGGG GTGTCCATGT TCCGTCTAAT 5040 

CCAAAACATG TTGTTTATGT TTGGATTGAT GCGTTAGTTA ACTATATTTC AGCATTAGGC 5100 

TATTTATCAG ATGATGAGTC ACTATTTAAC AAATACTGGC CAGCAGATAT TCATTTAATG 5160 

GCTAAGGAAA TTGTGCGATT CCACTCAATT ATTTGGCCTA TTTTATTGA7 GGCATTAGAC 5220 

TTACCGTTAC CTAAAAAAGT CTTTGCACAT GGTTGGATTT TGATGAAAGA TGGAAAAATG 5280 

35 . AGTAAATCTA AAGGTAATGT CGTAGACCCT AATATTTTAA TTGATCGCTA TGGTTTAGAT 5340 

GCTACACGTT ATTATCTAAT GCGTGAATTA CCATTTGGTT CAGATGGCGT ATTTACACCT 5400 

GAAGCATTTG TTGAGCGTAC AAATTTCGAT CTAGCAAATG ACTTAGGTAA CTT AGT AAA C 5460 

CGTACGATTT CTATGGTTAA TAAGTACTTT GATGGCGAAT TACCAGCGTA TCAAGGTCCA 5520 

CTTCATGAAT TAGATGAAGA AATGGAAGCT ATGGCTTTAG AAACAGTGAA AAGCTACACT 5580 

GAAAGCATGG AAAGTTTGCA ATTTTCTGTG GCATTATCTA CGGTATGGAA GTTTATTAGT 5640 

AGAACGAATA AGTATATTGA CGAAACAACG CCTTGGGTAT TAGCTAAGGA CGATAGCCAA 5700 

AAAGATATGT TAGGCAATGT AATGGCTCAC TTAGTTGAAA ATATTCGTTA TGCAGCTGTA 5760 

TTATTACGTC CATTCTTAAC ACATGCGCCG AAAGAGATTT TTGAACAATT GAACATTAAC 5820 

AATCCTCAAT TTATGGAATT TAGTAGTTTA GAGCAATATG GTGTGCTTAA TGAGTCAATT 5880 
ATGGTTACTG GGCAACCTAA ACCTATTTTC CCAAGATTGG ATAGCGAcGG AnAATTGCAT 5940 
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AACCTCAAAT TGATATTAAA GACTTTGATA 
ATGCTGAACA TGTTAAGAAG TCAGATAAGC 
AACAAAGACA AATTGTATCA GGAATTGCCA 

AAAAAGTAGC AGTTGTTACT AACCTGAAAC 
GTATGATATT ATCTGCTGAA AAAGATGGTG 

10 

TTCCAAATGG TGCAGTGATT AAATAACTGT 
TAATCGATAC ACATGTCCAT TTAAATGATG 

TTACACGTGc TAGAGAAGCA GGTGTTGATC 

CAATTGAACG CGCGATGAAA TTAATCGATG 
GGCATCCAGT TGACGCAATT GATTTTACAG 

CTCAGCATCC AAAAGTGATT GGTATTGGTG 

CTCCTGCAGA TGTTCAAAAG GAAGTTTTTA 
AGT7ACCAAT TATCATTCAT AACCGTGAAG 

AGGAGCATGC TGAAGAGGTA GGCGGGATTA 

CAGATATTGT AACTAATAAG CTGAATTTTT 
AAAATGCTAA ACAGCCTAAA GAAGTTGCTA 

30 

AAACCGATGC ACCGTATCTT TCGCCACATC 
GAGTAACTTT AGTAGCTGAA CAAATTGCTG 
GCGAACAAAC AACTAAAAAT GCAGAGAAAT 

GAGAAAGATC ACCGCCATAA ATGTAAACGA 
TTCTCACTTT TTTAAATTAA AATATCGTGC 

m 

AGCTTTGAAA TTAAGAATTG TAGGAAGGCG 

GTAGAAGGAC GAGATGATAC TGAGCGTGTT 
ACGAATGGTA GTGCCATCAA CGAACAAACT 

45 

CGAGGCGTTA TTGTATTAAC AGATCCAGAT 
ACTGAACATG TCAAAGGTGT TAAACATGCG 
AAAGGGAAAA TTGGTGTTGA ACATGCCGAC 

50 

GTTAGTTCAC CCTTTGATGA AGCTTATGAA 
GGGTTAATTG TTGGGAAAGA TGCAAGGCGC 

55 



AAGTTGAAAT TAAGGCAGCA ACGATTATTG 6060 

TTTTAAAAAT TCAAGTAGAC TTAGATTCTG 6120 

AATTCTATAC ACCAGATGAT ATTATTGGTA 6180 

CAGCTAAATT AATGGGACAA AAATCTGAAG 6240 

TATTAACCTT AGTAAGTTTA CCAAGTGCAA 6300 

ATTTTTAAAA ATTAGGAGAG ATAATTATGT 6360 

AGCAATACGA TGATGATTTG AGTGAAGTGA 6420 

GTATGTTTGT AGTTGGTTTT AACAAATCGA 6480 

AGTATGATTT TTTATATGGC ATTATCGGTT 6540 

AAGAACACTT GGAATGGATT GAATCTTTAG 6600 

AAATGGGATT AGATTATCAC TGGGATAAAT 6660 

GAAAGCAAAT TGCTTTAGCT AAGCGTTTGA 6720 

CAACTCAAGA CTGTATCGAT ATCTTATTGG 6780 

TGCATAGCTT TAGTGGTTCT CCAGAAATTG 6840 

ATATTTCATT AGGTGGACCT GTGACATTTA 6900 

AGCATGTGTC AATGGAGCGT TTGCTAGTTG 6960 

CGTATAGAGG GAAGCGAAAT GAACCGGCGA 7020 

AATTAAAAGG CTTATCTTAT GAAGAAGTGT 7080 

TGTTTAATTT AAATTCATAA AGTTAAAAGT 7140 

TGCTATATTC GTTTAATATG CTATGGTTCT 7200 

ATGTGGAATA CGTGCGATAG AGATGGTTAG 7260 

TTTTAAATGA AAATCAATGA GTTTATAGTT 7320 

AAACGAGCTG TTGAATGTGA TACGATTGAA 7380 

TTAGAAGTAA TTAGAAATGC TCAACAAAGT 7440 

TTCCCAGGAG ATAAAATTAG AAGTACAATT 7500 

TATATTGATA GAGAAAAAGC TAAAAATAAA 7560 

TTAATTGATA TTAAAGAAGC GTTAATGCAT 7620 

TCAATTGATA AATCTGTGCT AATAGAGTTG 7680 

CGTAGAGAAA TTTTAAGTAG AAAATTGCGA 7740 
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GCGGATGTAA GG CAAGCTTT AGAAGATGAA 
TATTGCAACA CCATCAAGAA CGCGAGCGTT 

5 

AAGTTTAGGA CAGAACTTTT TGATAGATGT 
TGATATTGAT GCACAAACTG GGGTGATTGA 
ACAATTGGCC AGACATGCTA AAAGAGTATT 

10 

TGTATTAAAT GATACACTAT CACCTTATGA 
AAAAGCGAAT ATTAAAGAAG CTGTTGAAAA 

TGTTGCAAAC CTGCCGTACT ATATTACGAC 
TATACCAATT GATGGCTACG TGGTGATGAT 
TGAAGTAGGT TCAAAAGCAT ATGGTTCGTT 

TAGTAAAGTA TTAACGGTAC CTAAATCTGT 

AGTTGTAAAA CTGATGCAGA GAACTGAACC 
CTTTAAGTTA GCAAAAGCAG CATTTGCACA 

25 

AAATTATTTT AAAGATGGTA AACAACACAA 
AGGTATTGAT CCAAGACGTC GCGGTGAAAC 
TGAAGAAAAG AAAAAATTCC CTCAATTAGA 

30 

TTGTTAAAAT TTAAATTTTG TTTGACGAAA 
AGCGAGGTGG AGCAATATGC CAAAATCAAT 

TGTAGGAAAT CGTATTGTAC TGAAaGCCAA 

TGGAATTTTA AAAGAAACAT ATCCGTCAGT 
CAACJTTGAG AGAGTATCTT ATACATACAC 

ATTTGAAGAG GATAATCATC ACGAATCAAT 
AGACGTTTCT TAGTATAAGA AGTAAATATT 
TTCAATACTC TTTTTATTTA CAAAATGTTT 

45 

GTAAATGGAT AATTGTATTT ATAAACACAA 
GAAAATTACT TTTTTATTAA AAAAACACTA 
CTCTTTATGT TAAAATCATC ATATTAAGAT 

50 

AACGGCACCA GCCAAAATTA ATTTTACGCT 
TCATGAGATT GAAATGATAA TGACAACAGT 

55 



TGAGGAAGTG AAAATGTTGG ATAATAAAGA 7860 

GTTAGATAAA TATGGCTTTA ATTTTAAAAA 7920 

GAATATCATT AATAATATCA TTGATGCAAG 7980 

AATTGGTCCA GGCATGGGGT CATTGACAGA 8040 

GGCATTTGAA ATTGATCAAC GTTTAATACC 8100 

TAATGTGACG GTGATTAATG AAGATATTTT 8160 

TCATTTACAA GATTGTGAAA AAATAATGGT 8220 

GCCAATTTTA TTAAATTTGA TGCAACAAGA 8280 

GCAAAAAGAA GTGGGCGAAC GCTTAAATGC 8340 

ATCAATTGTC GTACAATACT ATACAGAGAC 8400 

ATTTATGCCA CCACCTAATG TTGATTCAAT 8460 

GTTAGTAACA GTAGATAACG AGGAAGCATT 8520 

AAGAAGAAAG ACAATTAACA ATAACTATCA 8580 

AGAAGTGATT TTACAATGGT TGGAACAAGC 8640 

GCTATCTATT CAAGATTTTG CTAAATTGTA 8700 

AAATTAAATG ATTGACAAAG CAAAGCACTA 8760 

ACGTTGCAAA TATGGTATTA TGTAACTTGT 8820 

TTTGGACATC AAAAATTCTA TTGATTGTCA 8880 

TGGAGGCCGT AAGAaAACAA TAAAACGTTC 8940 

TTTCATTGTT GAGTTAGATC AAGACAAACA 9000 

TGATGTGTTA ACTGaAAATG TTCAAGTTTC 9060 

TGCACACTAA ATAAGACATA TAGAGATGTT 9120 

ATGATAATTA TTTGAGTGTT GGGcATTATG 9180 

AACACTGATG TTTCGCTTAT AGATTTTTCA 9240 

ATACAAGTAA ATACTAAGTA ATTAGATGGA 9300 

AAAAACAAAT TAAAATGTCA AATATTAATT 9360 

AACGAAAAGA GGGCGGAAAA TGATATATGA 9420 

CGATACACTT TTTAAAAGAA ATGATGGCTA 9480 

TGATTTAAAT GATCGTTTAA CTTTTCATAA 9540 
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AAATCTCGCA TATCGTGCAG CGCAACTATT TATTGAGCAA TATCAACTAA AGCAAGGTGT 9660 

AACAATTTCT ATCGATAAAG AAATACCTGT TTCTGCTGGC TTAGCTGGAG GTTCGGCTGA 9720 

TGCAGCAGCA ACGTTAAGAG GATTGAATCG ACTTTTTGAT ATAGGGGCGA GTTTGGAAGA 9780 

ATTGGCTCTA CTAGGCAGTA AAATCGGGAC AGATATTCCG TTTTGTATTT ATAATAAAAC 9840 

TGCACTATGT ACTGGAAGAG GAGAGAAAAT CGAGTTTTTA AATAAACCAC CTTCAGCTTG 9900 

GGTGATTCTT GCTAAACCAA ACTTAGGCAT ATCATCACCA GATATATTTA AGTTGATTAA 9960 

TTTAGATAAG CGTTACGACG TACATACGAA AATGTGTTAT GAGGCCTTAG AAAATCGAGA 10020 

TTATCAACAA TTATGTCAAA GTTTGTCTAA TCGATTAGAG CCAATTTCTG TTTCAAAACA 10080 

CCCACAAATC GATAAATTAA AAAATAATAT GTTGAAAAGT GGTGCAGATG GTGCGTTAAT 10140 

GAGTGGAAGC GGACCTACT3 TGTATGGGCT AGCACGAAAA GAAAGCCAAG CAAAAAATAT 10200 

TTATAATGCA GTTAACGGTT GTTGTAATGA AGTGTACTTA GTTAGACTAT TAGGATAGAA 10260 

GGGTTGAAAA GATGAGATAT AAACGAAGCG AGAGAATTGT TTTTATGACG CAATATTTGA 10320 

TG 10322 

(2) INFORMATION FOR SEQ ID NO: 99: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5614 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: 

GATTGATTAA ATGTTTTAAT CCACTTCAAT 
CATATAATTA TTCGATTTCA TTTGTTCAGC 

TTTAAAwGCG AAAATTGAAA TTGGTATCGT 

AATGAGCATT ATGTATAAAA AGATAGCAGC 
AATCATATGT GCTAAAGGTA ATTCTATTGT 

45 

TTCACCTATT TTCTTAGATT CCaCTACGCC 
TTTTTTACGA ATTTCAGATA AAATTTCATA 
TCCAAAACAA CACACTTGTG AAATATAAGC 

SO 

CGAATTAATC GTATATGTAT TGTTAATCAT 
AGGAACTAAT CCAGAAAAGA CACTGATGAT 

55 



SEQ ID NO: 99: 

GCCTTCGATA AACTCTACAA TCGCGCTATT 60 

ATATGTCTCA TTAAATCCAG ACATAACTTT 120 

TACTAATAAG GCACTAGCCA TACGCCAATC 180 

TGACAAAAGT AAGTTTCCTA TAACTTCAGG 240 

TTCAACCTTA TCGACAAATA TATTTTTTAA 300 

TAAAGGGAGA CGCATTAATT TTTGAGCTAA 360 

TGCCGTAATA TGTGATAGCA TCGTTGACGC 420 

GATTAAAGCA ATAAAGATAT AAACCATAAT 480 

CATTAAAATA ATTTTAAATA CTGCCCAATA 540 

AGACAACAAA ATTGATAACA TAATTTTCCA 600 
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ATATGTAACT CCTkTCAATT AATAATCTAA 
GATGATATAC ATAATATAAA TTTGTTATTT 

5 

CATATATTAG TTGATAACGA TTATCAATGT 
AATTCACAAG GTTATGGGGC AGAAATGATA 
TTCTGGGAGT GGGACAGAAA TGATATTTTC 

10 

TTGCATTGTC TCTAGAAATT GGGAATCCAA 
TTGTAGAGTC TAGTACATTG ATTTGTATCC 

15 TCTAATGATC CTATGACTCA ACTATTAAAT 

TTAAATTCAT TTATTGTAAT ATTGCAAAAA 
AATTAATTAC ATAATAAATT GAACATCTAA 

20 TGAGGGGATT TATTTAGGTG TTGGTTATTT 
CCAATGTGCA AAAAACGCAA CAAGACAGCC 
TAAATTGAAC ATCCGTCATA CACCTCCTCT 

25 

CGACCATCAT ATTATATCAT TTATTTATTA 
GTTTTTTTTA GTGGTTTACG CTACTTTAAT 
TAAATGTGAT GGGTATCGTA ATAATTAAAC 

30 

TTAGCCAGGA TACAAATACA TATAATAAAA 
TAATTGGAAA ACTAATGAAT TTTCTCCAAG 

35 TCAAATAATA TGAAATCACA AAAGCGACTA 

CAATGTGTAA TAATTTTAAC AGCAATAAAT 
CGCCAACAAT GATAAATTTT AAAATTTCAG 

40 TCCTCAACAT CAAAATATAT GCATAACTAC 
ATCCGCTTCA CTTCAAATAT GCTTATTTCA 
CTCTTTCCCC TCATCCTTAT ACGCCATTAT 

45 

GCACTATAGA GATTACTTTA GTTCACTAGT 
AATGAGAGGA TGTCTACTAT GCAATTACAA 
GAAAGTATGA CCGCACAGCA AGTTGGCAAT 

SO 

GGGAATACCC TTCATTATGA TATCATTCCG 
GCTTTAATGC ATGCAACAGG TGCCACTAAG 

55 



ATTAAGCCGC 


TTATATTATT 


TATTTCACTG 


720 


GTTAAAAATT 


AATACTTATT 


ACAAGTACAT 


780 


CGCGTGGATT 


TGTGACACAT 


TTCTTTTAAA 


840 


AAGAGCCACT AATGATTTAT 


TATGTAGTGG 


900 


ACAAAATTTA 


TTTCGTCGTC 


CCACCCCAAC 


960 


TTTCTCTTTG 


TTGGGTCCCT 


GAATATAGCC 


1020 


CAATGTCCCT 


ATAATTGATT 


ATTCGCTTTA 


1080 


CATTTTTCGA AATACTTAAT 


TCTAATATAA 


1140 


TACATTGCAC ACCTTGTTCA TCAATGCTAT 


1200 


ATACACCAAA 


TCCCCTCACT 


ACTGCCATAG 


1260 


GTCACCTTTT 


TTATTGTTGC 


GCGTTCGTAA 


1320 


GCTTATAGCT 


GAAGTCATGA 


TGTTAATTAA 


1380 


CTGCGTTAAA 


GTAACGCCCG 


AGATGTTAGG 


1440 


TATTTCACGC 


AATATTAAGG 


CT7AAGTAAA 


1500 


TGCTATCTTT 


TAAAATCCAT 


TTAGATAATA 


1560 


CAGCAAATGG 


TGCAATTTCT 


GCTGGCAAAT 


1620 


CTGTTTGTAA 


GCTTACGTTG 


ACAATCTGCG 


1680 


TAGGTTTTAC 


CCTGTAAACA 


AAATAACAAT 


1740 


GAAATCCGGT 


AATATGACTA 


ATCATATATT 


1800 


AGACAACATA 


ATAATTTAAC 


GTATTAATGC 


1860 


CATGCGTTTG 


TGTTAGTTTC ATATGTGTAc 


1920 


GTTCTCGAAC 


ATACTCGAAT 


ATGCGAGCCA 


1980 


ATCTTTATAC 


CCTTTCACAG 


CAAATTTAGT 


2040 


AATGTAACTG 


ATTTATCGCG 


TGACTCATTA 


2100 


AATriTATAT 


ACAATAAGAG 


CGACAACAGT 


2160 


AAAATTGTCA 


TCGCTCCTGA 


CTCATTTAAG 


2220 


ATTATAAAAC 


AGGCTTTTAC 


TAATGTTTAT 


2280 


ATGGCTGATG 


GTGGTGAAGG 


TACCACAGAT 


2340 


TATACAGTCA 


TCGTTAATGA 


CCCTTTAATG 


2400 
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GCGGCAGCGT CAGGTTTGGA TTTATTAGAA 
TCATATGGTA CCGGTGAACT AATTAAAGAT 
5 TTAGGGATTG GTGGCAGTGC AACAAATGAT 

GTAAAGTTTA CTGATGTAAA CGGGGACTTA 
ATTGCACAAA TCGATATAAC CAATCTAGAT 

10 

GCCTGTGATG TTTCAAATCC TTTATTGGGT 
CAAAAAGGCG CTGATGCAAA GATGATACCA 
GATAAGATAA AAATGTGCAC AGGAAAGTCC 

15 

GGCGGTATGG GCGCAGCATT ATTAGCGTTT 
GTCGTCTTTG ACATTACAGA TTTTCATCAA 

20 GGAGAAGGAC GCATGGATTA TCAGACCATC 

GCTGCAAAAC AATATCATAT TCCTGTCATC 
CAACATGTTT ACGATTTCGG TATTGATAGT 

25 TTAGAAGATG TCCTACAAAA TAGCGAACAA 

CGTATTCTGA AATTACAATA ATGTCAAAGT 
ACTTGAATGA GGTGAAACCC ATGAAAAGAA 

30 

ACAATCAAAA CCAAAATCAT CGTCGTCAAT 
CTAAAGGCGA TCCTGAAGAA CACCCGGAAC 
AACAAATTCT TGAAGAAGAA AACGAGAAAT 

35 

TTATTGCCAT TCTCTTAATT ATTGTCGCTA 
ATAGtGATAA AGTTAGTAAT GACCCTAAAG 
ATCAAGACGG CCAAATTAAC CAGCAAGTAG 

40 

AAAAAACTGA TGACATTATT AAAAATTTAC 
AACAAAACAA AGCTGATTCT AAGCTAACTC 
45 CAGAGGCAAA TAATGCACTT AAAAACAATG 

ATGATATTAA TACAAAATTC GACAGTATTA 
ACAATGGTGG CGCTAATTAA TTATTACACC 

SO 

CTTTATCTGT ATCACTACGT TATTCGTGAT 
TAAACTTGTA TTCTAACTAC ATACAAATAC 

55 
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AAAGAGGAAC GTAATCCTTT ATACACATCA 2520 

GCATTAAATC ATGGTGCTAA GACCATTATT 2580 

GGTGGTACAG GTATGCTAAG TGCACTAGGC 2640 

TTACAAATGA ATGGTGCTAA TCTTGCTCAC 2700 

TCGCGATTAA AAGAGGTGAC CTTTAAAGTG 2760 

GAAAATGGTG CTACCTATAT TTATGGTCCT 2820 

AAGTTGGATT TCGCAATGTC GCATTATCAT 2880 

GTTAATCAAA TACCAGGTTC TGGTGCAGCT 2940 
TGTGAGACAA CTTTAACAAA AGGTATTGAT 3000 

AGAATTAAAG ATGCAGACCT CGTTATTACT 3060 

TTTGGTAAAA CACCCGTAGG CGTTGCGTTA 3120 

GCGATTTGTG GCAGTCTAGG CGAAAATTAT 3180 

GCCTATTCTA TAATCTCTTC ACCTAGCACT 3240 

AATTTATTAA ACACTGCAAC TGACATTGCT 3300 

AAATCATCAG CTTTATTATT TGCAGTTAAA 3360 

CTGATAAATA CCGTGATTCA TATCAATACG 3420 

CTGAAGACGC ATCGTATAGA CAACAATATG 3480 

GATACTATAA TGGTAGAGAT TATCGAAGAG 3540 

CCCGCCGTTC AAAAAAATGG TTATATATCA 3600 

TTTTTGTCAC ACGCGCCTTA CTTAACAATG 3660 

TCTCTCAAAA TTATAAAAAA CAAGTTGAAA 3720 

ATAATGCTAA AGAAAATATT AAAAACAACC 3780 

AAAATCAAAT CGACAACTTG AAGCAGCAAG 3840 

AATTTTATCA AGACCAAATC AACAAATTGA 3900 

CAAGCCAAGG TAAAATTGAA AGCATGTTAA 3960 

AATCTAAATT AGAAAGCTTA TTTAAAGATG 4020 

TGCTTTGATG ATAAACATTA ATTCCCTATA 4080 

GATGCATTAA GAGTATAGGG ATTTTTTATA 4140 

ACACAAAACG TATATAATTT ATATAATTAT 4200 




TTATTGCTAA TTACGTTAGG CGTCATGACC GCTTTTGGCC CACTAACTAT AGATATGTAC 4320 

GTACCATCAT TACCTAAAGT GCAAGGTGAT TTTGGTTCTA CTACATCAGA AATTCAATTA 4380 

ACATTATCAT TCACAATGAT TGGTCTTGCA CTAGGCCAAT TTATCTTTGG ACCTTTATCC 4440 

GATGCTTTTG GTCGCAAACG GATTGCTGTA TCCATTTTGA TCATTTTCAT TTTGGTATCA 4500 

GGTTTGTCTA TGTTTGTTGA TCAATTGCCA TTATTCTTAA CTTTACGATT TATTCAAGGT 4560 

TTAACTGGTG GTGGCGTCAT CGTGATTGCA AAAGCCTCTG CTGGTGATAA ATTTAGTGGC 4620 

AACGCACTCG CTAAATTTTT AGCATCTTTA ATGGTAGTTA ATGGCATCAT CACTATTCTT 4680 

GCACCATTAG CCGGTGGATT AGCTTTATCC GTAGCAACAT GGCGTTCTAT TTTCACAATT 4740 

TTAACTATTG TGGCACTCAT CATTTTAATT GGCGTCGCTT CTCAATTACC TAAAACATCT 4800 

AAAGATGAAT TAAAGCAGGT GAATTTTAGT AGCGTCATTA AAGATTTTGG AAGTCTTTTG 4860 

AAAAAACCAG CATTTATTAT TCCAATGCTA TTACAAGGwT TAACTTATGT AATGCTATTT 4920 

AGTTATTCAT CTGCATCGCC ATTTATTACT CAAAAATTGT ATAATATGAC ACCCCAACAA 4980 

TTTAGTATCA TGTTTGCTGT TAACGGTGTA GGTTTAATCA TTGTCAGTCA AGTCGTTGCT 5040 

TTATTAGTAG AAAAATTACA TCGCCACATA TTATTAATCA TTTTAACTAT TATACAAGTG 5100 

GTAGGTGTTG CTTTAATTAT CCTGACACTT ACATTCCATT TACCACTTTG GGTCTTACTC 5160 

ATCGCATTCT TCTTAAATGT GTGTCCTGTG ACGTCAATTG GACCGCTTGG TTTCACAATG 5220 

GCTATGGAAG AACGAACAGG TGGCAGTGGT AACGCATCAA GTTTACTTGG CTTATTCCAA 5280 

TTTATCTTAG GTGGCGCTGT TGCACCATTA GTTGGCTTAA AAGGCGAATT TAATACATCA 5340 

CCATATATGA TTATTATCTT CATTACAGCC ATTCTATTAG TCAGTCTACA AATCATTTAC 5400 

TTTAAAATGA TTAAAAAGCA ACATGTCGCA TAACACTTCA ACATAATTAG AACCCTAGCA 5460 

AAGATATCTA TCTTTGTCAG GGTTCTTCTT TATGAATTAT GAGATCGAAT CTTCAACTAA 5520 

AATTACGCCT TCATAGCAAG GACATTTCTA TTCAATCACC CTTTAACAGG CATCCAAATT 5580 

TcTGTAATAT ATTTTTCACT TGTAGTATCA CCAT 5614 
(2) INFORMATION FOR SEQ ID NO: 100: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9179 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100: 

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AAAGACAATG ATATGAAGTA TATGGATATC ACAGAaAAAG TGCCAATGTC GGAATCTGAA 120 

GTTAACCAAT TGCTAAAAGG TAAGGGGATT TTAGAAAATC GAGGGAAAGT TTTTCTAGAA 180 

GCTCAAGAAA AATATGAGGT TAATGTCATT TATCTTGTTA GCCATGCATT AGTAGAAACA 240 

GGTAACGGCA AATCAGAATT AGCAAAAGGC ATTAAAGATG GGAAAAAACG CTATTACAAC 300 

TTTTTTGGTA TAGGAGCATT CGATAGTAGT GCTGTTCGTA GTGGGAAAAG TTATGCTGAA 360 

AAGGAACAAT GGACATCACC AGATAAGGCG ATTATTGGTG GTGCAAAGTT CATTCGTAAT 420 

GAATATTTTG AAAACAATCA ACTGAATTTA TATCAAATGC GATGGAATCC AGAAAATCCT 480 

GCGCAACATC AATATGCGAG TGACATTCGC TGGGCAGATA AAATTGCCAA ATTAATGGAT 540 

AAATCCTATA AGCAGTTTGG TATAAAGAAA GATGATATTA GACAAACATA TTATAAATAA 600 

GACATCGGTG CTTAAAGGAG CTGGAACAAT TTATTGTTTC GAGCTCCTTT AGCGCATTCT 660 

GAGTGTGTTA GTTAAATGGA TTTTAACCTA ACAAAAAACG CTATATAGCA TCAAATATGC 720 

TATATCCCAC ATCATTGTTA CAAATGTACA TGATGTAAAT GAATATTGCT GTCTAAATGT 780 

GCATGTAATA TACAATGGTG CAGATAATAC ACTTAAGTCC TTAAAAATGA AACGTTAgTT 840 

CCAAGAGTCA TTTTTAAACA ATAGTGCATG TGATAAAATA GAAAAGAATG AAAAATATAG 900 

AGGTGACAAT ATGAAGATAG CAATTATAGG TGCAGGCATC GGTGGATTAA CAGCTGCTGC 960 

ATTATTACAA GAACAAGGTC ATACTATTAA AGTCTTTGAA AAAAATGAGT CAGTTAAAGA 1020 

AATTGGCGCT GGGATTGGTA TCGGAGATAA TGTGCTTAAA AAACTAGGTA ATCATGACTT 1080 

AGCTAAAGGT ATTAAAAATG CTGGGCAAAT CTTATCTACA ATGACAGTGT TAGATGACAA 1140 

AGATCGCCTG TTAACTACTG TTAAATTAAA AAGTAATACA TTGAATGTGA CGTTACCACG 1200 

CCAAACATTA ATTGACATTA TTAAATCTTA TGTAAAAGAT GACGCAATAT TTACAAATCA 1260 

TGAAGTCACG CATATAGATA ATGAGACAGA TAAAGTTACC ATACATTTCG CGGAACAAGA 1320 

AAGTGAAGCA TTTGATTTAT GTATTGGTGC TGATGGAATT CATTCTAAAG TGAGACAATC 1380 

TGTAAATGCT GACAGTAAAG TATTATATCA AGGGTATACA TGCTTTAGAG GTTTAATTGA 1440 

TGATATTGAT TTAAAGCATC CGGaTTGTGC AAAAGAATAC TGGGGaAGAA AAGGaAGAGT 1500 

AGGTATTGTT CCGTTATTAA ATAATCAAGC ATATTGGTTC ATTACAATTA ACTCGAAGGA 1560 

AAACAATCAT AAATATAGTT CGTTTGGTAA ACCTCATTTG CAAGCATACT TTAATCACTA 1620 

TCCAAATGAA GTTAGAGAGA TCTTAGACAA ACAAAGTGAA ACAGGTATCT TATTGCATAA 1680 

TATTTATGAT TTGAAACCAC TCAAATCTTT TGTTTATGGT CGTACTATTT TACTAGGAGA 1740 

TGCAGCACAT GCGACAACGC CTAATATGGG GCAAGGTGCT GGACAAGCAA TGGAAGATGC 1800 

55 
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SO 
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5 



10 



25 



TAAAATACGT GTCAAACATA CTGCAAAAGT AATTAAGCGT TCTAGAAAAA TCGGTAAAAT 1920 

TGCCCAATAT CGTAGTCGTT TATTTGTTGC AGTTAGAAAT CGTATTATGA AAATGATGCC 1980 

AAATGCATTA GCAGCTGGAC AAACTAAATT CTTATATAAA TCGAAAGAAA AATAATACAA 2040 

CAATATGAAA ACCCCCGTAT GTTGAAACGA GAGCTCAACA TATGGGGGTT CTTGTTTTTA 2100 

TAATGTTATT ATAATAAATT CAATTATTAG TTAACGACAA ATTGTGGTTT CTCACCTTGA 2160 

ACGGCACTAA TTGCAGCATT AGCAACAATT TTAGACATCA TGTCACGTGC TTCAAATGTA 2220 

GCATTACCAA TATGCGGTGT TAATACTACA TTATTAAGTG ATTTTAAGTC ATCGGTAATA 2280 

TCTGGTTCAA ATTCATATAC ATCAAGTGCA GCACCTTCAA TTTCATTATC TTTCAATGCT 2340 

TGCACTAGTG CTTGTTCGTG CACGATTGGA CCACGAGAGG CATTGATTAA ATACGCCGTA 2400 

GATTTCATCA TTTTAAATTG TTCTGTATCA ATTAAATGAT GCATTTTAGG ATTATAAGCA 2460 

GCGTTGATAG TGATAAAATC TGCATTCTTT AATAGTGTAT CTAAATCTAC ATATTTTGCA 2520 

CCGATTTCTC GTTCTTTTTC TTCTTTGCGA TTAGGTCCAG TGTATAGCAC ATCCATGTCA 2580 

AATGCTCTTG CACGACGAGC TACTGCACTA CCAATTTCAC CTAAACCGAT AATGCCGATT 2640 

GTTTTCCCAG ATACTTCTCT ACCTCTGAAA AATAAAGGTG CCCATCCATC AAATCCAGTT 2700 

GTACGTGATA ATTGGTCCCC TTCAACAATA CGACGCGCTA CTGCAAGTAC TAATCCAATT 2760 

GTTAAATCAG CAGTCGCGTT TGTTGATGCT TTAGGTGTGT TTGTAACATC TATACTTTTT 2820 

TCTCGGGCAT ACTCGATATC AATATTATTA AAACCAGCGC CATAGTTGGC AATGATTTTT 2880 

AAGTCTTTAC CAGCATCGAT AACATCTTTA TCAACGTTTG TAGATAATAA ACTAATTAAG 2940 

GCAGTCGCGT TTTTAACACC TTTAATTAAA GTGTCTTTAT CGACTAATCC TTTACCTTCA 3000 

TACATTTCAA CTTCAAAATG TTCTTGTAAA AGTTTTAAAC CTACTTCTGG TATtGCACCA 3060 

gCAACATAAm CTTTTtCCAT AAAAGAtCAC TCCTTTTATC TTAGTATAGT AGAAGATTAG 3120 

ACAGTATACA ACTATGTCAT GATGTCTTGT GTATCAATGA TGTAAGCGCG TACTTTTGAT 3180 

GGAGGCGATA TAACTTAGGC ACTGTAGAAC TATGAATATT GTAATGTGGA AAAACTGGAT 3240 

CAATTAAATT AGATAACGTA GTTTTAAAGT TAATAGTATT AGAAAAAATT AATATTTTGA 3300 

ATATGGGAGG AAATATAAAT AAGTAGGTGG CAACGAAAAA TAGCAAAAAA AGAGCTTCTC 3360 

CTATAAAGGA AAGCTCAAAG TTTTTTGATG ACATATGTAC TAGAATTAAG TTTCAAGACA 3420 

ATATGTATCA TCGTGTTTAT ATTAAATATG GATGTAGTTG TAGTTACCTG CTTCACTTGC 3480 

AGAAATAGTT CTAGAACTTA CTGAGAAAGG TCCGCCACTA TAATTCATTT CTGAAATTGT 3540 

AACTGAACCA TCACTGTTTA CACTTTCTAC ATATGCAACG TGACCAAATG GTCCTTCAGA 3600 
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AGCAGCAGCC CAATTATTAG 


CATTTCCCCA AGTAGAACCG ATTTCTCCGC 


CAACTTTATC 


3720 




ATATACATAC CAAGTACATT 


GTCCTGCAGT 


GTATAAGTTA 


CCAGAATGTG 


AAATTGATGA 


3780 


5 


TGTAGTTGTC GTAGTTGTCG 


TAGTCGTTGT 


AGTTTGAGTC 


GTGTTGTAGT 


TATAGTTGTT 


3840 




GTAATTTGTA TAATTTTCAG 


CAGCATCTGC 


ATGATGTGCT 


TGACCTACTA 


ATGCTGTGCC 


3900 


70 


GATTCCTGCT GTTAACGTAG 


TTGCTGTTAC 


TAATTTTTTC ATGAATAAAG 


TCCTCCAAAG 


3960 


TTCTATATCT TTTTTTATAA ATAAAACGTA 


GCGACTGTTT 


TATTCTCACA 


TCTCGAATTG 


4020 






AAAATLAATG 


CTTCTTGTGG 


GGAATGTTAT 


TGATTTGTAA 


4080 


75 




TAATTTTGTA 


ATAAAAATTA 


GTCAAAGTTA 


CAATGAGATT 


4140 




AACAGATAAT TAATAGGAAA 


TATTTATTTG 


TAATATGTTT 


AAATAAATCG 


AATTGTTAAA 


4200 




GGTATTATAT ATTCTTGGCC 


ATTATAATAT 


TTGACACACG 


CAATAATTGT 


GAATACAAAA 


4260 


20 


GATAATATTG AGAAAGCGAA 


TATGGATAAA 


ATACCGATAA 


ACGTAATGAT 


GAAACCTATA 


4320 




ATAATAATGA AATCAATATC 


TGTAGCAATT 


AGGAAAACGC 


CTATTAAAGT 


GATAACGACT 


4380 




AAAACGATAG ACCAAATAAT 


ATAAGAAATC 


GTATAGTTAA 


GATAATTTTT 


TCCAGCACGA 


4440 


25 


TCAACTAGTT TCGATTCATC 


TTTTTTCAAT 


AACCATATTA 


TCAGTGGACC 


AATAATAGAT 


4500 




GTGAATAAAC TTAATAAATA 


GATAAGCATC 


GCCATAATGT 


TCTCATCATT 


GGATTTGCGA 


4560 


30 


TTCGGTTGAT GATTTGTTAC 


GTCGTTCATT 


TCAGTTGTCA 


TATTAGACAC 


TCCTTTGAAA 


4620 


ATTGTAATAT TATCTTTAAC 


TATAACAAAA TATAATCAAA AATAAACATG 


TTTATTAAAC 


4680 




AATTATTAAA AATAAAAATA 


ATTGGTGGAC 


GTCGGCGTTT 


AAATAGGTTA 


ATTTAAGGTT 


4740 


35 


ATATATACTT AACATTTATA ATGATGCGTA ATGAATTCGC ATCATTTTTA 


TATTGTCTTA 


4800 




CGTATAATTT GTTTTTAATT 


TTAACCAAAG 


ATAGAAAGAG 


GGTTGTTTAT 


GAAAATAGCA 


4860 




ATTGTAGGAT CAGGAAA7GG 


CGCAGTTACG 


GCAGCAGTAG 


ATATGGTGAG 


CAAAGGCCAC 


4920 


40 


GATGTTAAAT TATATTGTCG 


TAATCAATCT 


ATAAGTAAGT TTCAAAACGC AATCGAAAAG 


4980 




GGCGGATTTG ATTTTAATAA 


TGAAGGTGAT 


GAACGTTTCG 


TAAAATTCAC 


TGATATTAGT 


5040 




GATGATATGG AATATGTTTT AAAAGATGCT 


GAAATTGTTC 


AAGTGATTAT 


TCCATCTTCA 


5100 


45 


TACATAGAGT ATTATGCTGA 


TGTAATGGCA 


GAGCATGTAA 


CTGATAATCA 


GTTGATATTC 


5160 




TTCAACATGG CTGCAGCAAT 


GGGGTCAATT 


CGTTTTATGA ATGTTTTAGA AGATAGACAT 


5220 


SO 


ATTGAAACAA AACCACAACT AGCGGAAgcT AATACGTTGA CGTATGGTAC GCGTGTCGAT 


5280 


TTTGAAAATG CAGCAGTTGA 


TTTATCTCTA 


AATGTACGTC 


GTATCTTCTT 


TTCAACATAT 


5340 




GATAGAAGCT GTCTAAATGA 


TTGTTATGAC 


AAAGTTTCAA 


GTATTTATGA 


TCATTTAGTA 


5400 
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CCAACATTAT TGAATGTCGG TCGCATTGAT TATGCTGGCG AGTTCGCTTT ATATAAAGAA 5520 

GGAATTACTA AACATACAGT TAGATTACTT CATGCAATCG AATTAGAACG TTTGAATTTA 5580 

GGCCGTAGAT TAGGTTTTGA ATTATCAACA GCTAAAGAAT CACGTATTGA ACGTGGTTAT 5640 

TTAGAACGTG ATAAAGAAGA TGAACCATTA AATCGTTTGT TTAATACAAG CCCAGTATTT 5700 

TCACAAATTC CAGGACCAAA TCATGTAGAA AGCAGATATT TAACTGAAGA TATTGCATAT 5760 

GGTTTAGTAC TATGGTCAAG CTTAGGTCGT GTTATTGATG TACCGACACC AAATATAGAT 5820 

GCAGTAATTG TAATTGCATC AACCATTTTA GAGAGAGACT TCTTTGAGGA AGGCTTAACA 5880 

GTTGAAGAAA TTGGTTTAGA TAAGCTTGAT TTAGAAAAAT ATTTAAAATA AATGATGGCT 5940 

TGAAGATAGA AAAGGATATA GCATTATGCA AAAGCAATAA ATTGAAGAAA AGAGGTTTCT 6000 

CATCAATAAG CGnAGGGGAC GATAGATGAT GAAAAGAAAA CCCACCTTTT TAGAATCAAT 6060 

TTCGACAATG ATTGTAATGG TTATTGTTGT TGTAACAGGC TTTGTGTTTT TTGATATTCC 6120 

AATTCAAGTA TTATTAATTA TTGCCTCAGC ATATGCCACA TGGATTGCAA AACGTGTAGG 6180 

CTTAACATGG CAAGATTTAG AAAAAGGCAT TGCAGAACGT TTAAATACTG CAATGCCTGC 6240 

AATTTTAATT ATACTAGCGG TAGGAATTAT AGTAGGCAGT TGGATGTTTT CTGGCACAGT 6300 

GCCAGCCTTG ATTTATTATG GCTTAGATTT ATTGAATCCA AGCTATTTTT TAATATCAGC 6360 

CTTTTTTATA AGTGCTGTTA CATCTGTAGC AACTGGTACA GCATGGGGCT CTGCATCAAC 6420 

TGCAGGGATT GCACTTATTT CTATTGGTAA TCAATTGGGG ATTCCTCCAG GGATGGCAGC 6480 

GGGTGCTATT ATAGCAGGGG CTGTGTTTGG CGATAAAATG TCACCATTAT CAGATACAAC 6540 

TAATTTAGCG GCGCTTGTTA CTAAAGTTAA TATATTTAAA CATATACATT CGATGATGTG 6600 

GACGACGATA CCTGCATCAA TCATAGGTTT ATTAGTATGG TTTATTGCTG GATTTCAATT 6660 

TAAAGGGCAT TCAAATGATA AACAGATTCA AACTTTGTTA TCAGAGCTTG CACAGATTTA 6720 

TCAAATTAAC ATATGGGTCT GGGTTCCCTT AATTGTGATC ATTGTTTGTT TGCTATTTAA 6780 

AATGGCTACA GTGCCAGCTA TGCTAATATC AAGCTTTTCT GCCATTATAG TGGGGACTTT 6840 

TAATCATCAT TTCAAAATGA CAGATGGTTT CAAAGCAACA TTTAGTGGTT TTAACGAATC 6900 

AATGATACAT CAGTCTCATA TTTCATCCAG TGTGAAAAGC TTGTTAGAAC AGGGTGGTAT 6960 

GATGAGTATG ACCCAAATAT TAGTAACGAT ATTTTGCGGA TATGCATTTG CAGGTATTGT 7020 

AGAAAAAGCA GGATGTTTAG AAGTCTTATT AACTACTATT TCTAAAGGCA TCCATTCTGT 7080 

AGGAAGTTTA ATATGTATTA CTGTTATTTG TTGTATTGCG CTTGTATTCG CTGCAGGTGT 7140 

TGCTTCGATT GTAATTATTA TGGTCGGTGT GTTAATGAAA GATTTGTTCG AAAAATACCA 7200 
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AATACCATGG GGAACATCAG GTATTTACTA 
ATTTTTCATA TGGACAGTAC CATGTTATTT 
5 TACAGGGATA GGTATTAAAA AGTCATCGAA 

ATATATAATA TGTTGAAACA CTTTAATCAT 
GTTTTAACTT AGAATAAATA TCCTCTATGC 

10 

GTTGATATAT GTAATGTAAG TTTTATGTCA 
TTGAAGGCAA GTATATTTGT AAGTACTTTA 
GACATGCCTA AA7TTGGGTG TGTCAATGGC 

15 

AATCGGATTG GTGAAAATCG AAATTTTGAG 
ATTAAAAAAG CCAACAAGGC TCTTGAAACC 

20 AGTGAATGAA GTTATAACCA GCAGCTTGGC 

CTGGGCCATA ACCATAGTTC ATTTCTGAAA 
CAACGTATGC AACGTGACCG TATGCACCTT 

25 GTGTATTGTT CACTGTGTAA CCAGCTCTTG 

CCCAAGTTGA ACCGATTTTA CCACCTACAC 
AAGTGTATAA GTTACGTCCT GAAGTATAAC 

30 

GAGCCATAGT TGTAGTTACT TGAACATTGT 
CACCAGTACG GTAGCTGTTT GTGTTGTAAC 
TATTATTTGA GTAGTTGTTG TAACGGCTGT 

35 

TGTTATAGTT ATTGTAACCA TTGTAGTAGT 
ATTGACTTGG ATGCCAGTTA CCTTTCCATG 

TGTAAGTATA GCTATATGAT GTTGGGTCGT 
AAGCATGAGC TTGATTTCCT GATGCAATTG 
TAGCTGTAGC GATTTTCTTC ATTTTAAAAA 

45 TTTTCGTAAT GTCCGTGTGA CAAAATTAAT 
AAGAAAGACT ATAACAGAAA TTAGCGTCCT 
TGCTAATATC TTGACACAAT AGAATTTTAA 

SO 

ATAACTACGG CATTCTTTGT GAAAACTGAA 
TAATATTACT GAAAATTCTA AATGTATATT 

55 



TACGAATCAA 


CTTCATGTCT 


CTGTTGAAGA 


7320 


ATGCGCAATT ATAGCAATTA TCTATGGTTT 


7380 


TTCACGTTTA ACTTAATGTG AGCGTGGAAT 


7440 


TTATAATTGT AGCGGTTATA ATTTGAAAAG 


7500 


ATATACTGAA 


TATGTTTTGT 


AGCGGAACAT 


7560 


TGATTTGTAA 


TGACTAAATT 


AATTGAGAAT 


7620 


ACTAAAAATT 


TATCAATGTA 


TAGCCGATTT 


7680 


TGTATGTTGT 


TTATTCTTTA 


TTACAGAGTG 


7740 


ATTTTTACCA ATTCGATTTT 


TTTCATAGAA 


7800 


TTGTTGGCGT AAACATAGCC 


ATCACTAATT 


7860 


TAGCTGAGAT 


TGTACGTGAA 


GTTACAACAC 


7920 


CTCTTACTGA 


ACCATTGCTG 


TTAACACTTT 


7980 


GAGTTGTTTG 


CATAATTGCA 


CCAGCTTTTG 


8040 


CAGCTGCGTT 


AGCCCAGTTA 


CTTGCATTGC 


8100 


GATCAAATAC 


GTAGTATGTA 


CATTGACCAG 


8160 


CACTTGAGAT 


TGAACGGCCA 


TTTGATGATG 


8220 


TGCTTGAAGT 


GCTGTAGCTT 


GCACCTAAAC 


8280 


TATTATAGTT 


ATTGTAGTTA 


TATGATTGAT 


8340 


AGTTATTGTA 


GCTATAACCG 


TTGTTGTAAT 


8400 


AATAGCTGTA 


GTAGCCATTA 


TCTTGGTTTA 


8460 


TGTAATGGTA 


GTTACCTTGT 


GCATCAATAG 


8520 


TTGGATTATA 


ACCGTAGTTA 


TCTTGCTCAG 


8580 


CGATTGTAGC 


GAATCCTGCA 


GTTGCGATAG 


8640 


TATCCTCCTA AAAATTTTAA 


ATCTAAAATA 


8700 


GTTATAAGTT 


ATCTCTCGTA 


ATTAAACGAC 


8760 


TGTGTGCTTT 


GTTAACGTTT 


TGTAATTTTT 


8820 


AAGTATAGAA ATTTGCATTT 


TGCAAAACTT 


8880 


TGTTTCGAAA ATAAGTCTGT 


TACAAATTTG 


8940 


TTGTGCATAA 


TATAGGACTT 


TTAATCAGAA 


9000 
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GGATGAAAAT GTATATTTAA TGGATAAAAT ATCCTAATTT AGCATAAAAA AATGTTTTAA 9120 
TAAAAGTATT ATTTGATATA ATCGATTTAT GTTTTGTTAC TGCTAAAAAA CATGTGGCG 9179 
(2) INFORMATION FOR SEQ ID NO: 101: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1868 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101: 






CCTTCAGCCA 


TTTGACTTCG 


ACATGAGTTG 


CCTGTACATA 


TAAAATAAAT 


TGTTTTTTTA 


60 




GTCATAACAA 


TCTCCTAATT 


AATTAAAATA 


TGATAAGTGT 


TAGATACAAC 


CCTATGAGGG 


120 


20 


TTATAAATAG 


TACTGGAATT 


GTAATGATGA 


TACCAGTTTT 


AAAGTATGTG 


CCCCAAGAAA 


180 




TCTTAACATC 


TTTTTGtGTT 


AAGACGTGTA ACCACAGTAA 


TGTAGCTAAA 


GAGCCTATCG 


240 




GTGTAATTTT 


TGGACCTAAA 


TCAGAACCGA 


TAACATTCGC 


ATAAATTAGG 


CCTTCTTTTA 


300 


25 


ACATGCCATG 


GACA1TTGAT 


TGACCAATAG 


CAATCGCATC 


TATTAAAACT 


GTAGGCATAT 


360 




TATTCATTAT 


TGATGATAAA 


AACGCTGAAA 


TGAAGCCCAT 


TCCCAAAATA 


GTGCTAAATA 


420 




GACCGTAATT 


GGAAATATAT 


TCTAATATTT 


TAGCCAATAT 


TAAAGTAATG 


CCAGCATTTC 


480 


30 


TTAAGCCGAA 


TACGACGATA 


TACATACCAA 


TTGAAAATAA 


TACTATATTC 


CAAGGTGCGC 


540 




CCTTAATGAC 


TTGCTTAATA 


TTTACAGCAT 


TTGATTTACG 


AGCCAACATT 


AGAAAAATAA 


600 


35 


AAGCAATGAT 


TCCAGTGAAA 


ATTGATACCG 


GAATTTTAGT 


AAATTTACTG 


ATTAGATAGC 


660 


CGAAAAGTAA 


TATAACTAGA 


ACAATCCaTG 


AAATTTTAAA 


TAGCTTTAAA 


TCATTAATGG 


720 




CATCFTTAGG 



ATGCTTTATA 


TTATTATCAT 


CAAACGTTTT 


AGGTATCGCT 


TTTCTAAAAT 


780 


40 


ATAACCACAA 


TACTATAATA 


CTTGCTAAAA 


GCGAGAATAA ATTAGGTATA 


ATCATTCTAC 


840 




TAAAATATCG 


AACGAATCCT 


ACATGAAAAT 


AATCAGCAGA 


TATAATATTC 


ACTAGATTGC 


900 




TCACGATTAA AGGTAAAGAA 


GTTGTGTCAG 


CTATAAAACC 


ACTCGCAATA 


ATnAAAGGGA 


960 


45 


ATATGGCCCG 


CTTACTAAAA 


CCTATATTTT 


TAACCATCGC 


TAATACAATA 


GGCGTTAAGA 


1020 




TTAAcGTGCG 


CCATCATTTG 


CGAAAAATGC 


AGCAACAATG 


GCACCCAATA 


ATATGATATA 


1080 




AACGAACATT 


TTTAAACCAT 


TGCCTTTTGA 


AGCATGAAGC 


ATGTGAATAG 


CTGACCATTC 


1140 


50 


GAATAATCCA 


ACTTTATCTA 


ATATTAATGA AATAAGAATG 


ACTGAGACAA 


AAGTCAAAGT 


1200 




AGCATTCCAA 


ACAATACCTG 


TTACTTCGAA 


AACATCGGAA 


AAACTTACAA 


CACCAGTAAT 


1260 



55 




TAATACAAAT AATAAAGTTA CTAGAAAAAT GAGTGTCGCT AAAGTTGTCA TCATTAGCAT 1380 

TCACCAGTCT TAAGGTTATG ACAAATACAT CGTTGGTTAG AGGTATGAAC CTTAGACAAG 1440 

TTATTAATTA CGGACTCAAA AATATTATGA TTgAGCTGGT ATAAATGTTT ATTTCCGATT 1500 

TTTCGTGTCG TAACTAAGTT GGTTTTTACT AATGCTTTCA TATGrTAGCT AAGTGTAGGT 1560 

TGAGAGAATT GAAAATGTGC TAACAAATCA CAAGCGCATA ACTCTCCACA AGAAAGTAAA 1620 

TCTAGTATTT CTAATCTGCT TGAATCTGAT AAAACTTTTA AAAATGTTGC TAGTTCTTTA 1680 

TACGTCATAA CATACCTCCT AGACGTTAAA TAGATTATCA TCTATATAGA TGAATGTCTA 1740 

TGTTCCTTTG GTATATTACA CGATATGACT ATGTAATTTA AATTTGGTTT TAGTATTAAA 1800 

AGGGTATTAA AGATAAATTA TAGATATTGA TTTTGCAAAA TATACTCTTT GTTCTGCATT 1860 

GAAAAAGG 1868 
(2) INFORMATION FOR SEQ ID NO: 102: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15249 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102: 


ATTTATGAAA TCCATAGCnA TAAACATTAT TCTTGCATCG GCTATACAAA CAGTTACCGC 60 

AAGCAAATTT GTATATCAAC CTGGAATTGT GTTCACGTCA ATGGCaAATG CCGATGATGT 120 

GTTATCAGGC GATAGTTATT TTATGGCTGA ATTAAAATCT ATTAAGCGTA TTGTTGAAAT 180 

TCCAGATAAT CAAAAAATAT ACTGCTTTAT AGATGAAATT TTTAAAGGTA CCAACACAAC 240 

TGAACGAATT GCCGCTTCAG AATCAGTACT ATCATTTTTA CATGAAAAAT CTAACTTTAG 300 

AGTTATTGCA GCAACACATG ATATTGAGTT AGCTGAACTC TTAAAACAAC GTTATGAAAA 360 

TTACCATTTC AATGAGGTAA TAGAAAATAA TAACATACAT TTTGATTACA AAATTAAGCC 420 

TGGCAAAGCA AATACACGTA ATGCCATCGA ATTATTAAAA ATCACTTCAT TTCCAGCAAA 480 

AATATATGAA CGAGCAAAAG ATAATGTCCC GAAAATTTAG CATTTAACTT TAAACATAAA 540 

AACGTCAGCT ATCACATGAC AGAAGACTAT GAACAGTTTC AATAATGTTC ATAGTAATCA 600 

TGTTAATAAC TGACGTTTAT TTTATTCTGC AGAATACTCT TCTAAATCTA TATTGCTGTG 660 

CCCATTTAAT GCTAAATCAG CAAATCGACC TTGCTGATAC AAATAGTGGC CGGCAACGCC 720 

TATCATTGCA GCATTATCTG TGCATAATTT AGGACTTGGG ATAGTTAATT GAATGTCATT 780 
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AACAATTAAT CGCTGAACAC CATATTCTTT ACAAGCTTGA ATAGCTTTAA ACGTGAGCAC 900 

CTCTACAACA CTGTTTTGAA AGCTCGTTGC TACGTTAGCT TCAATGATTG GaATATTTTT 960 

TTGTCGTTGA TTGTGAAGTT GATTGATTAC GGCACTTTTC AACCCACTAA AACTAAAATC 1020 

ATAACTATCT TTATCCAACC AAACACGAGG GAATGAATAA GTATCTTCAC CTTCAGCAGC 1080 

CAACCGATCA ACTTGTGGAC CACCTGGATA ATTTAAACCA ATTGTTCGTG CCACTTTATC 1140 

ATAAGCCTCA CCTACTGCGT CATCTCGTGT TTCACCAATG ACTTCAAATG ATAAATGATC 1200 

CTTCATATAA ACTAATTCAG TATGTCCACC TGAAACAATA AGTGCAATTA GCGGGAATGT 1260 

TAATGGCTCT TCTATGTGAT TAGCATATAT ATGTCCTGCA ATATGATGAA CAGGAATAAG 1320 

TGGCTnATCG TAAGCAAATG CCAATGCTTT GGCTGCATTA ACACCTATTA GTAACGCACC 1380 

AATTAGTCCA GGGCCTTCTG TAACCGCTAT GGCATCAATA TCTTCTATTG ATACATCGGC 1440 

ATCCCCTAGA GCCTCGTTTA TTGTTGCTGT TATACCTTCA ACGTGATGTC TACTTGCCAC 1500 

TTCGGGAACG ACACCGCCAA ATCGTTTATG ACTTTCAATC TGACTTAAAA CTGTATTTGA 1560 

TAAAATATCT CTGCCATTTT TTATAACACT AACGCTTGTT TCATCACAAC TTGTTTcAAC 1620 

AGCTAGTATT AATATATCTT TAGTCATTTA AATTCACCCA CATAACCATT GCGTCCTCAC 1680 

CTTCACCATA ATAATTTTTA CGTTTACCAC CATATTGAAA TCCTAAATTT TCATATACAT 1740 

GTTGTGCCAC TTTATTATTA ACTCTTACTT CTAAACTCAT CACATCACAA GTGTGACTTG 1800 

CATAGTTTAT TCCGTATTTT AAAAGCATTT GACCTAAACC ATAGCCTCTA TAATTATCAT 1860 

CGATTGCAAC TGTTGTAATT TGAGCTTGAT CGATAACAAT CCATAAACCT AAATAACCAA 1920 

TAATTTGTTG TTCAAATTCC AAGACAAAAT ATTTCGCAAA GTTATTTTGC TCTATTTCAT 1980 

GATAAAATGC GTCAATTGTC CAAGAACTGT CATTGAAACT CCGACGCTCA AGATCAAAGA 2040 

CTTQTGGCAC atcttcttta GTCATCTCTC TAATGTTTAA TTGTTCTTTT GACTGTTGAT 2100 

CCAATTTCGT TCCGCCTCAG CTAATTTATG GTATTTAGGA GTAAATGTAT GTACGTCTGA 2160 

AGGTTTATCT AGCAATTGAT ACATGACTGA TGCATTTGGT AGctGCGCAA TCACTTCACC 2220 

TTGTAATTCA TCTTGTAATT TTACAGTATC TTTCCCAATA TAAATAAATG GTTGGTTTAA 2280 

ATCTTCTAAA AAAGCTCGCA ATGCCTCTAT CGACATATAT TGATCTTCTA AAATAGTCAC 2340 

TAATTGACCA TTTTGCCACT GGAATATGCC TGTATAAACT GCTTGTCGTC TTGCATCAAA 2400 

CACAGGAACC AATAATTTAT CAGTATGATC GATTGTTGCT GCCAATGCCT TTAATGATGA 2460 

AACACCATAT AATTTAACAT CTAACGCATA CGCTAATGTT TTAGCAACAG TAACACCGAT 2520 

ACGTAAGCCA GTATATGAAC CAGGACCTTC AGCAACAATA ATCGCATCTA ATTGCTGTTT 2580 
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TTGTTTAGAA TCCGTAGTTA TTTCAGCTAA AACTTCATCG TTTTGCATCA ATGCTACTGA 


2700 




TAATGGTTGA TTCGATGTAT CAATGAGCAG CGAATTCATG GATAATTGCC TCCTTAATTT 2760

GTTCATAATG TTCTCCTTGC GCGAACAACT CAATTTGTCT TGTATTTTCA GATATTGTTG 2820

AAATGTTAAT AGATAAATGC GTCGCTGGAA GTAAATCTTT TATAAATTGA CTCCATTCAA 2880

TAACAGTAAT TGCCTGATCT TCGAAAAATT CATCAAATCC TAAATCTTCA TCAGAATCTT 2940

CTAAGCGATA ACAATCCATA TGATGCAATT TTAAATTTTT ACCCCTATAT GATTTAATGA 3000

TGTTAAATGT CGGGGAATTA ATCGTACGTC TTACACCAAG AGCTTTTCCT ATAAATTGCG 3060

TTAACGTTGT TTTACCTGCT CCTAAATCTC CGTTAAGTAA AATCAAATCA CCACTTTTCA 3120

ATTGCTCAAC TAAAAATATA GCAAATTGAT TCATTTCATC TAAATTATTT ATCTTTATCA 3180

ATGTTGATTC TCCTATATTA TGCTTTTCAT TCATAAAAAT GATTATCCAT TGTTCAATCG 3240

TATCTAACTT TATATTTAAC CTTTATATTG TAACAAATTT CAACTTAAAT TTCTTATCTT 3300

TGAAACAGAT TATCTATTCA AAGTTAATTG TAAGAAAATT TAAAATATTT GTTGACATAC 3360

TAAAGCAGAT ATAGTAAATT AAATTTATCA AATTTTTAGA CAATTCTAAC TATTAAAGTG 3420

ATATATACCA TTCACGGAAG GAGTATAATA AAATGCTTAA TCAATATACT GAACATCAAC 3480

CGACAACTTC AAATATTATT ATTTTATTAT ACTCTTTAGG ACTCGAACGT TAgTAAATAT 3540

TTACTAAACG CTTTAAGTCC TATTTCTGTT TGAATGGGAC TTGTAAACGT CCCAATAATA 3600

TTGGGACGTT TTTTTATGTT TTATCTTTCA ATTACTTATT TTTATTACTA TAAAACATGA 3660

TTAATCATTA AAATTTACGG GGGAATTTAC TATGCGAaCG AgcATGATCA AAAAAGGAGA 3720

TCACCAAGCA CCAGCAAGAA GTCTTTTACA TGCCACGGGC GCGCTAAAAA GTCCAACTGA 3780

TATGAACAAA CCATTTGTAG CTATTTGTAA CTCTTATATT GATATTGTTC CTGGACATGT 3840

TCACTTGAGA GAGCTTGCAG ATATAGCTAA AGAAGCAATT AGAGAAGCCG GTGCCATTCC 3900

ATTTGAATTC AATACAATTG GTGTTGATGA TGGAATAGCT ATGGGACATA TCGGAATGCG 3960

ATATTCTCTA CCATCACGTG AAATTATTGC AGATGCAGCT GAAACTGTAA TTAACGCTCA 4020

TTGGTTTGAC GGCGTATTTT ACATTCCTAA TTGTGACAAG ATTACACCCG GTATGATTTT 4080

AGCAGCCATG AGGACAAACG TACCAGCTAT CTTTTGCTCT GGTGGACCAA TGAAAGCTGG 4140

CTTATCTGCA CATGGAAAAG CATTAACACT TTCATCAATG TTTGAAGCAG TCGGCGCATT 4200

TAAAGAAGGA TCGATTTCTA AAGAAGAATT TTTAGATATG GAACAAAATG CCTGCCCTAC 4260

TTGTGGTTCA TGTGCTGGGA TGTTTACTGC AAATTCAATG AACTGTTTGA TGGAAGTTTT 4320

AGGTCTAGCA TTACCATACA ACGGTACTGC ACTTGCAGTC AGTGATCAGC GACGAGAAAT 4380
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TATCGTTACT CGCGAAgCAA TTGATGATGC 
ATTTGCACTT GATATGGCTA TGGGTGGTTC 4500

AACAAACACG GTACTGCATA CGTTAGCCAT TGCCAATGAA GCTGGTATTG ATTATGACTT 4560

AGAGCGCATT AATGCTATTG CCAAACGCAC GCCATATTTA TCAAAAATAG CACCTAGTTC 4620

ATCGTATTCA ATGCATGATG TGCATGAAGC TGGTGGCGTC CCAGCAATTA TTAATGAATT 4680

GATGAAGAAA GATGGCACGT TACACCCAGA TAGAATCACA GTTACTGGCA AAACGTTACG 4740

TGAAAATAAC GAAGGCAAAG AAATTAAGAA CTTTGATGTC ATTCACCCTC TTGATGCACC 4800

ATATGATGCA CAAGGCGGTT TATCTATCTT ATTTGGTAAT ATCGCCCCTA AAGGCGCAGT 4860

TATTAAAGTT GGCGGCGTTG ATCCATCTAT CAAAACATTT ACTGGGAAAG CAATTTGTTT 4920

CAATTCGCAT GATGAAGCTG TTGAAGCAAT AGACAATCGT ACCGTTCGTG CAGGCCACGT 4980

CGTTGTCATT AGATATGAAG GACCTAAAGG TGGACCAGGT ATGCCTGAAA TGTTAGCACC 5040

TACTTCCTCT ATTGTTGGTC GCGGCTTAGG TAAAGATGTT GCATTAATTA CTGATGGGCG 5100

TTTTTCCGGT GCCACAAGAG GTATTGCAGT TGGTCATATT TCCCCTGAAG CTGCATCTGG 5160

TGGACCAATT GCCTTAATTG AAGATGGTGA TGAGATTACT ATTGATTTAA CAAATCGTAC 5220

ATTAAACGTA AACCAGCCTG AAGATGTTCT AGCGCGTCGC CGAGAATCTT TAACACCATT 5280

TAAAGCGAAA GTAAAAACAG GTTATCTAGC TCGTTATACT GCCCTAGTAA CTAGCGCAAA 5340

TACAGGTGGC GTCATGCAAG TCCCTGAGAA TTTAATTTAA TTTATTTTTA TATTGGAGAT 5400

GGTTAAAATG TCTAAAACTC AACATGAAGT AAACCAAAAT ATTGACCCTT TAAAAATGGC 5460

TGAATCACTT GAACCTGAAC AACTAAATGA AAAAACTTTA AATGATATGC GTTCAGGATC 5520

AGAAGTGCTA GTAGAAGCTC TACTTAAAGA AAATGTGGAT TATTTATTCG GTTATCCTGG 5580

TGGTGCCGTA CTACCTTTAT ATGACACGTT TTATGATGGT AAAATCAAAC ATATTTTAGC 5640

AAGflfCACGAA CAAGGTGCTG TTCATGCTGC AGAAGGTTAT GCACGTGTAT CTGGTAAamT 5700

GGCGTCGTTG TAGTTACAAG CGGTCCaGGT GCAACTAATG TAATGACAGG TATTACGGAT 5760

GCACATTGCG ACTCTTTACC TCTAGTTGTA TTCACTGGAC AAGTTGCTAC ACCAGGCATT 5820

GGTAAAGATG CATTCCAAGA AGCGGATATT CTATCTATGA CTTCACCAAT TACAAAACAA 5880

AATTATCAAG TGAAACGTGT TGAAGATATC CCTAAAATCG TACACGAAGC TTTCCATGTA 5940

GCTAATTCTG GACGCAAAGG TCCTGTAGTG ATTGATTTTC CAAAAGATAT GGGTGTTTTA 6000

GCTACAAATG TGGATTTATG CGACGAAATC AATATTCCAG GTTATGAAGT TGTTACAGAA 6060

CCAGAAAATA AAGACATTGA CACTTTCATC TCACTTTTAA AAGAAGCGAA AAAGCCTGTC 6120

GTATTAGCCG GCGCAGGTAT TAATCAATCA AAATCAAATC AATTATTAAC ACAGTTTGTT 6180
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GATACACTAT 


TTTTAGGTAT GGGAGGAATG CATGGTTCTT ATGCTAGTAA CATGGCATTA 6300

ACTGAGTGTG ATTTACTCAT TAATTTAGOT AGCCGCTTCG ATGATAGATT AGCAAGCAAA 6360

CCTGATGCCT TTGCACCTAA CGCCAAAATT GTACATGTAG ATATTGATCC TTCAGAAATC 6420

AATAAAGTTA TTCATGTAGA TTTAGGTATT ATTGCAGACT GTAAAAGATT TTTAGAATGT 6480

TTAAATGATA AAAATGTTGA GACTATAGAA CACAGTGACT GGGTTAAACA TTGTCAAAAT 6540

AATAAGCAGA AACACCCATT TAAACTTGGT GAAGAAGATC AAGTATTTTG TAAGCCACAA 6600

CAAACAATCG AATATATCGG CAAAATTACA AATGGTGAAG CAATTGTTAC TACAGACGTG 6660

GGACAACATC AAATGTGGGC AGCTCAATTT TATCCATTTA AAAATCACGG ACAATGGGTT 6720

ACAAGCGGTG GTTTAGGAAC AATGGGATTC GGTATTCCTT CGTCAATTGG TGCCAAATTA 6780

GCTAATCCTG ATAAAACAGT CGTATGTTTC GTCGGTGACG GTGGTTTCCA AATGACAAAC 6840

CAAGAAATGG CACTTTTACC CGAATATGGT TTAGATGTCA AAATCGTACT AATCAATAAT 6900

GGAACATTAG GTATGGTTAA ACAATGGCAA GATAAGTTCT TTAATCAACG CTTCTCACAC 6960

TCAGTATTTA ATGGTCAACC TGA'rrTTATG AAAATGGCAG AAGCATATGG CGTCAAAGGT 7020

TTCTTAATCG ATAAGCCAGA ACAACTGGAA GAACAATTAG ATGCAGCGTT TGCTTATCAA 7080

GGACCAGCTT TAATTGAGGT TCGTATTTCC CCTACTGAAG CTGTAACCCC AATGGTTCCG 7140

AGTGGCAAAT CAAATCATGA AATGGAGGGC TTATAATGAC AAGAATTCTT AAATTACAAG 7200

TTGCGGATCA AGTCAGCACG CTAAATCGAA TTACAAGTGC TTTTGTTCGC CTACAATATA 7260

ATATCGATAC ATTACATGTt ACACATTCTG AACAACCTGG GATTTCTAAC ATGGAAATTC 7320

AAGTCGATAT TCAAGATGAT ACATCACTTC ATATATTAAT TAAAAAATTA AAACAACAAA 7380

TTAATGTTTT AACGGTTGAA TGCTACGACC TTGTTGATAA CGAAGCTTAA TTTTAAGACA 7440

aaggCaatga TGCGCTAATT AGTTATAGAT atatcatagg CTGCTAGTTA ACATCTGCCA 7500

CTATTACAAA GTTATATTTC AGAA'ITITCG AAACACAAAA TATTTAATTA TTTGGAGGAA 7560

TTTATTATGA CAACAGTTTA TTATGATCAA GATGTAAAAA CGGACGCTTT ACAAGGCAAA 7620

AAAATTGCAG TAGTAGGTTA TGGATCACAA GGTCACGCGC ATGCACAAAA CTTAAAAGAC 7680

AATGGATATG ATGTAGTCAT CGGCATTCGC CCAGGTCGTT CTTTTGACAA AGCTAAAGAA 7740

GATGGATTTG ATGTGTTCCC TGTTGCAGAA GCAGTTAAGC AAGCTGATGT AATTATGGTG 7800

CTATTACCTG ATGAAATTCA AGGTGATGTA TACAAAAACG AAATTGAACC AAATTTAGAA 7860

AAACATAATG CGCTTGCATT TGCTCATGGC TTTAACATTC ATTTTGGTGT TATTCAACCA 7920

CCAGCTGATG TTGATGTATT TTTAGTAGCT CCTAAAGGAC CGGGTCATTT AGTTAGACGT 7980
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10 






CAAGCACGTA ATATTGCTTT AAGTTATGCA AAAGGTATTG GTGCAaCTCG TGCAGGTGTT 8100

ATTGAAACAA CATTTAAAGA AGAAACTGAG ACAGATTTAT TTGGTGAACA AGCAGTACTT 8160 

TGCGGTGGTG TATCGAAATT AATTCAAAGT GGCTTTGAAA CATTAGTAGA AGCGGGTTAT 8220 

CAACCAGAAT TAGCTTATTT TGAAGTATTA CATGAAATGA AATTAATCGT TGATTTGATG 8280 

TATGAAGGCG GTATGGAAAA TGTACGTTAC TCAATTTCAA ATACTGCTGA ATTTGGTGAC 8340

TATGTTTCAG GACCACGTGT TATCACACCA GATGTTAAAG AAAATATGAA AGCTGTATTA 8400 

ACTGATATCC AAAATGGTAA CTTCAGTAAT CGCTTTATCG AAGACAATAA AAATGGATTC 8460 

AAAGAATTTT ATAAATTACG CGAAGAACAA CATGGTCATC AAATTGAAAA AGTTGGTCGT 8520 

GAATTACGCG AAATGATGCC TTTTATTAAA TCTAAAAGCA TTGAAAAATA AGATAGACCT 8580 

ACAATGAGGA GTTGTTAAAT ATGAGTAGTC ATATTCAAAT TTTTGATACG ACACTAAGAG 8640

ACGGTGaACA AACACCAGGA GTGAATTTTA CTTTTGATGA ACGCTTGCGT ATTGCATTGC 8700

AATTAGAAAA ATGGGGTGTA GATGTTATTG AAGCTGGATT TCCTGCTTCA AGTACAGGTA 8760 

GCTTTAAATC TGTTCAAGCA ATTGCACAAA CATTAACAAC AACGGCTGTA TGTGGTTTAG 8820 

CTAGATGTAA AAAATCTGAC ATCGATGCTG TATATGAAGC AACAAAAGAT GCAGCGAAgC 8880

CGGTcGTGCA TGTTTTTATA GCAACATCAC CTATTCATCT TGAACATAAA CTTAAAATGT 8940 

CTCAAGAAGA CGTTTTAGCA TCTATTAAAG AACATGTCAC ATACGCGAAA CAATTATTTG 9000 

ACGTTGTTCA ATTTTCACCT GAAGATGCAA CGCGTACTGA ATTACCATTC TTAGTGAAAT 9060

GTGTACAAAC TGCCGTTGAC GCTGGAGCTA CAGTTATTAA TATTCCTGAT ACAGTCGGCT 9120

ACAGTTACCA TGATGAATAT GCACATATTT TCAAAACCTT AACAGAATCT GTAACATCTT 9180

CAAATGAAAT TATTTATAGT GCTCATTGCC ATGACGATTT AGGAATGGCT GTTTCAAATA 9240

GTTTAGCTGC AATTGAAGGC GGTGCGAGAC GAATTGAAGG CACTGTAAAT GGTATTGGTG 9300 

AACGAGCAGG TAATGCAGCA CTTGAAGAAG TCGCGCTTGC ACTATACGTT CGAAATGATC 9360 

ATTATGGTGC TCAAACTGCT CTTAATCTCG AAGAAACTAA AAAAACATCG GATTTAATTT 9420 

CAAGATATGC AGGTATTCGA GTGCCTAGAA ATAAAGCAAT TGTTGGCCAA AATGCATTTA 9480 

GTCATGAATC AGGTATTCAC CAAGATGGCG TATTAAAACA TCGTGAAACA TATGAAATTA 9540

TGACACCTCA ACTTGTTGGT GTAAGCACGA CTGAACTTCC ATTAGGAAAA TTATCTGGTA 9600 

AACACGCCTT CTCAGAGAAG TTAAAAGCAT TAGGTTATGA CATTGATAAA GAAGCGCAAA 9660 

TAGATTTATT TAAACAATTC AAGGCCATTG CGGACAAAAA GAAATCTGTT TCAGATAGAG 9720 

ATATTCATGC GATTATTCAA GGTTCTGAGC ATGAGCATCA AGCACTTTAT AAATTGGAAA 9780 
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AAGAGGGTCA 


TATTTACCAG GATTCAAGTA TTGGTACTGG TTCAATCGTA GCAATTTACA 9900

ATGCAGTTGA TCGTATTTTC CAGAAAGAAA CAGAATTAAT TGATTATCGT ATTAATTCTG 9960

TCACTGAAGG TACTGATGCC CAAGCAGAAG TACATGTAAA TTTATTGATT GAAGGTAAGA 10020

CTGTCAATGG CTTTGGTATT GATCATGATA TTTTACAAGC CTCTTGTAAA GCATACGTAG 10080

AAGCACATGC TAAATTTGCA GCTGAAAATG TTGAGAAGGT AGGTAATTAA TTATGACTTA 10140

TAACATTGTT GCCCTACCTG GTGATGGAAT CGGTCCAGAA ATTTTGAACG GATCTCTATC 10200

ATTGCTTGAA ATTATAAGTA ATAAATATAA CTTTAATTAT CAAATAGAGC ACCACGAATT 10260

TGGTGGTGCC TCTATTGATA CATTCGGCGA GCCTTTAACT GAGAAAACCT TAAATGCGTG 10320

TAAAAGAGCA GATGCTATTT TACTGGGTGC AATCGGTGGA CCTAAATGGA CAGATCCTAA 10380

CAATCGACCA GAACAAGGAT TATTAAAATT GCGTAAATCC TTAAATTTAT TTGTAAATAT 10440

ACGCCCCACT ACCGTTGTCA AAGGCGCTAG TTCTTTATCA CCTTTAAAGG AAGAACGCGT 10500

TGAAGGCACA GATTTAGTTA TAGTCCGTGA ATTGACAAGT GGTATTTATT TTGGAGAACC 10560

TAGACATTTT AATAATCACG AGGCCTTAGA TTCTCTTACT TATACAAGAG AAGAAATAGA 10620

ACGCATTGTT CACGTAGCAT TTAAATTGGC CGCTTCAAGA CGAGGAAAAC TAACATCAGT 10680

TGATAAAGAA AATGTATTAG CTTCTAGTAA ATTGTGGCGC AAAGTCGTAA ATGAAGTAAG 10740

TCAATTATAT CCAGAAGTAA CAGTAAATCA CTTATTTGTT GATGCTTGTA GTATGCATTT 10800

AATCACAAAT CCAAAACAAT TTGACGTCAT CGTATGTGAA AACTTATTTG GCGATATrTT 10860

AAGTGATGAA GCTTCAGTGA TTCCTGGTTC ACTTGGTTTA TCACCTTCTG CTAGTTTTAG 10920

TAACGATGGT CCAAGATTGT ATGAGCCTAT TCATGGATCA GCACCAGATA TTGCAGGTAA 10980

AAACGTTGCC AATCCATTTG GAATGATTCT ATCTTTAGCG ATGTGTTTAC GTGAAAGCTT 11040

AAATCAACCA GATGCTGCAG ATGAATTAGA ACAACATATT TATAGCATGA TTGAACATGG 11100

GCAAACGACA GCAGATTTAG GCGGCAAATT GAATACTACT GATATTTTCG AAATTCTATC 11160

TCAAAAATTG AATCACTAAG GGGGAGATGT AAATGGGTCA AACATTATTT GACAAGGTGT 11220

GGAACAGACA TGTGTTATAC GGGAAATTGG GCGAACCGCA ACTATTATAC ATTGATTTAC 11280

ACCTTATACA TGAAGTTACT TCTCCTCAAG CATTTGAAGG ACTTAGGCTT CAAAACAGAA 11340

AATTAAGACG CCCAGATTTA ACATTTGCAA CACTCGATCA CAATGTTCCT ACTATTGATA 11400

TATTCAATAT TAAAGATGAA ATTGCAAACA AACAAATCAC AACATTACAA AAAAACGCCA 11460

TAGATTTTGG GGTGCATATT TTTGATATGG GTTCTGATGA ACAAGGTATT GTTCACATGG 11520

TAGGACCTGA GACAGGACTT ACACAGCCTG GCAAGACAAT CGTTTGTGGT GACTCTCACA 11580
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ATGTTTTCGC AACTCAAACG 


CTATGGCAAA CAAAACCCAA AAACTTAAAA ATCGATATTA 11700

ATGGTACCTT ACCAACAGGC GTCTATGCTA AGGACATTAT TCTGCATTTA ATTAAAACGT 11760

ATGGTGTTGA CTTTGGTACA GGCTATGCTT TGGAATTTAC TGGCGAAACA ATTAAAAACC 11820

TTTCAATGGA TGGTCGAATG ACTATTTGTA ACATGGCTAT CGAAGGTGGT GCCAAATACG 11880

GCATAATCCA ACCTGATGAT ATAACATTTG AATATGTTAA AGGGAGACCA TTTGCCGATA 11940

ACTtCGCTAA ATCAGTTGAT AAGTGGCGTG AgCTATATTC TGATGACGAC GCGATATTTG 12000

ATCGTGTAAT TGAACTTGAT GTTTCAACAT TAGAACCACA AGTGACATGG GGAACTAATC 12060

CTGAAATGGG TGTTAATTTC AGTGAACCAT TCCCTGAAAT CAATGATATC AACGATCAAC 12120

GTGCGTATGA TTATATGGGG TTAGAACCAG GTCAAAAAGC TGAAGACATC GACTTAGGGT 12180

ATGTTTTTCT CGGTTCATGT ACAAATGCTA GACTATCAGA TTTGATTGAA GCTAGTCATA 12240

TTGTTAAAGG AAATAAAGTT CATCCAAATA TTACAGCTAT TGTCGTACCA GGTTCTCGTA 12300

CAGTAAAAAA AGAAGCAGAA AAATTAGGTC TAGATACTAT CTTTAAAAAT GCAGGATTTG 12360

AATGGCGTGA ACCAGGATGT TCAATGTGTT TAGGCATGAA TCCTGACCAA GTACCTGAGG 12420

GCGTACATTG TGCATCTACA AGTAATCGAA ACTTTGAAGG ACGACAAGGC AAAGGTGCAA 12480

GAACACATTT AGTATCCCcT GCTATGGCAG CAGCAGCAGC TATTCATGGT AAATTTGTGG 12540

ACGTAAGAAA GGTGGTTGTT TAAATGGCAG CAATCAAACC TATTACAACA TATAAAGGTA 12600

AAATAGTCCC TCTCTTCAAC GACAATATCG ATACAGACCA AATCATTCCT AAGGTACACT 12660

TAAAGCGTAT TTCAAAAAGT GGCTTTGGTC CATTTGCTTr TGATGAATGG CGGTACTTAC 12720

CTGATGGTTC AGATAATCCT GATTTCAATC CTAACAAACC ACAATATAAA GGGGCTTCTA 12780

TTTTAATTAC TGGAGATAAT TTTGGATGTG GTTCAAGTCG TGAACATGCT GCTTGGGCTC 12840

TTAAGGACTA TGGTTTTCAT ATTATTATTG CAGGAAGTTT CAGTGACATA TTTTATATGA 12900

ATTGCACTAA AAATGCGATG TTGCCTATCG TTTTAGAAAA AAGTGCCCGT GAACATCTTG 12960

CACAATATGT TGAAATTGAG GTCGATTTAC CAAATCAAAC TGTGTCATCA CCAGACAAGC 13020

GTTTCCATTT TGAAATTGAT GAAACTTGGA AGAATAAACT TGTAAATGGC TTAGATGACA 13080

TTGCAATCAC CCTACAATAT GAATCATTAA TAGAAAAATA TGAAAAATCa CTTTAAGGGA 13140

GTTGAATATT ATGACAGTCA AAACAACAGT TTCTACGAAA GATATCGATG AGGCATTTTT 13200

AAGACTTAAA GATATTGTCA AAGAAACACC TTTACAATTA GACCATTACT TATCTCAAAA 13260

GTATGATTGT AAAGTCTATT TAAAACGAGA AGATTTACAA TGGGTACGTT CTTTTAAATT 13320

AAGAGGTGCT TACAACGCTA TTTCTGTTTT ATCAGATGAA GCTAAAAGTA AAGGTATTAC 13380
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10 






AAACGCTGTT ATCTTTATGC CAGTCACTAC ACCTTTACAA AAGGTAAATC AAGTAAAGTT 13500 

CTTTGGAAAT AGTAACGTTG AAGTTGTACT CACTGGTGAT ACATTTGATC ACTGTTTAGC 13560 

TGAAGCTTTA ACTTATACAA GTGAACATCA AATGAACTTT ATAGATCCAT TCAATAATGT 13620 

TCATACAATT TCTGGACAAG GTACGCTTGC TAAAGAAATG CTAGAACAAG CAAAGTCTGA 13680

CAATGTTAAC TTTGATTATC TATTTGCCGC AATTGGTGGT GGCGGTTTAA TTTCAGGTAT 13740 

TAGTACTTAC TTTAAAACCT ATTCACCTAC CACGAAAATT ATAGGTGTTG AACCTTCAGG 13800

TGCAAGTAGT ATGTATGAAT CTGTTGTGGT AAATAATCAG GTAGTCACAT TGCCTAATAT 13860 

CGATAAATTT GTGGACGGTG CATCTGTAGC TAGAGTTGGC GATATTACAT TTGAAATTGC 13920

AAAAGAAAAT GTAGATGATT ACGTTCAAGT AGATGAAGGT GCAGTTTGTT CTACGATTTT 13980 

AGATATGTAT TCAAAACAAG CAATTGTAGC AGAACCTGCT GGCGCATTAA GTGTAAGTGC 14040 

GCTTGAAAAC TATAAAGATC ATATTAAAGG TAAAACAGTG GTTTGTGTCA TTAGTGGTGG 14100 

TAATAATGAT ATTAATCGAA TGAAAGAAAT TGAAGAACGT TCATTACTAT ACGAAGAAAT 14160 

GAAGCATTAC TTTATCTTAA ATTTCCCTCA ACGTCCAGGT GCATTGAGAG AATTTGTAAA 14220 

TGACGTATTA GGACCTCAAG ACGATATTAC TAAATTTGAA TACTTAAAAA AATCTTCTCA 14280 

AAATACAGGT ACTGTCATTA TTGGTATTCA ACTTAAAGAT CATGATGATT TAATACAACT 14340 

CAAACAACGT GTAAAtCATT TCGATCCTTC CAATATTTAT ATTAATGAAA ATAAGATGTT 14400 

ATATTCATTG TTAATTTAAC ACATAGTAAG AAAAACAGTC ATAAATTGAT TTCTAATTGA 14460

AATCATCTTA TGACTGCTTT TTATTATACT TTACATTTCT CGTTTCGTCA GATTCAAACG 14520

TTTTCACTTC GCCAAGCCAT CTTTCTTTGT GTTTGCTTTT aTTTTGACGT TTTAGACATA 14580

AAAAAaGAGA CCTTGCGGTC TCAATGCGGC TCATCGCATC CACTTTTTGC CTGGCAACGT 14640

TCTACTCTAG CGGAACGTAA GTTCGaCTAC CATCGACGCT AAGGAGCTTA ACTTCTGTGT 14700 

TCGGCATGGG AACAGGTGTG ACCTCCTTGC TATAGTCACC AGACATATGA ATGTAATTTA 14760 

TACATTCAAA ACTAGATAGT AAGTAAAAGT GATTTTGCTT CGCAAAACAT TTATTTTGAT 14820

TAAGTCTTCG ATCGATTAGT ATTCGTCAGC TCCACATGTC ACCATGCTTC CACCTCGAAC 14880

CTATTAACCT CATCATCTTT GAGGGATCTT ATAACCGAAG TTGGGAAATC TCATCTTGAG 14940

GGGGGCTTCA TGCTTAGATG CTTTCAGCAC TTATCCCGTC CACACATAGC TACCCAGCTA 15000 

TGCCGTTGGC ACGACAACTG GTACACCAGA GGTATGTCCA TCCCGGTCCT CTCGTACTAA 15060 

GGACAGCTCC TCTCAAATTT CCTACGCCCA CGACGGATAG GGACCGAACT GTCTCACGAC 15120 

GTTCTGAACC CAGCTCGCGT ACCGCTTTaA TGGGCGAACA GCCCAACCCT TGGGACCGAC 15180 
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GTGGAACTT 

(2) INFORMATION FOR SEQ ID NO: 103: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14051 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



15249 






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103: 



GTGGCAATAT TTCTAGTTCT CGTTTTGATA

TGTCCTGATT TGAATTAGAT ACAAATTCAT

CATATGTTTC ACCTTTATAT ACAGTTCGAA

TTTTCAATAT GTAACCTTTC GCACCATTAC

CAAACATTGT TAATATTAGT ATTTTAGTTT

CGATAAGACC TGACTCACCT GGTGGCATAC

ATTCCATTAC TTTTTGGTAA GCTTCGACGC

CATTTTGATA ATTTAAAATC ATAGAGAACC

TGACTATTTT CAA'rm'ATT CCCCCAATGT

TTGGTACCCT CACCAATTTT CGTTTCAATA

TCATTCATTC CATATAAACC GAGTCCAGAA

TTTCCCGCAT CTATCACTTC TGCTACCAAA ATTTCATTTA CATCAGCGTA TTTCAACGCA

ACAACCGTTT CAATATCACT ATCAAAGCGA

TTTATTCCAT AATTTTCTTC AAACTGTTTA

AGATCATCCA AAGAAGCGGG TCTTAATTCA

TTAGCGACAA TATATTCAAT ATTTTCTGCG

TATTTTAATA ATCTCAATTG AACATCTACA

AACTCTCTAG AAATTCGCTT TCTTTCATTT

CGTTGTTGAT GCAATTTCTC TTGCTGTTCA

GCATGAATTC CCCTGTCTTG ATCAATCAAC

TGATCTTTCG TCTTCATAAA TACTTGGAAA



60 
120 
180 
240 
300 
360 
420 
480
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080
1140 
1200 
1260 
1320 
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0 

ATCGCATTCG CCACAGCACT GTAATTATCT TCTTCAGATA ATATATCTTT AGCAGCATCA 1440 

TTCATTGCAA TAATTTTACC GTTATCATCA GCAAAAACTA TCTTTTCGAT TGAATGCTCA 1500 

TAATATTTTT TCAATAAAGT ATCTAACTGT ATACTGTCCT CATTAATCAT GACTTACACC 1560

CTAATTCATC TCATTATTTA TCATCATTGA AAATACCAAA CTTACGTTGA ATATCATCAT 1620 

TATCAAATAT TTTTGGTAAA GGACGACCAT CTCTTTGACC AAATAATAGT ACGCCATACA 1680 

CTTGATTCTT ATACCAAAGC GGCACTGCTA AAACTGCTGT TAATGATTCG CTCAATAAAA 1740 

TTGGATAGTC AATCTTTTCT TCAGGCCCTA AAGCTAAACC AACATTGGCT ATTACCATAC 1800 

GCTTTCCTGT TTTCATAACA GTTCCAGCTA ATCCACGACC TTTTCTTAAA ATAATCAATT 1860 

TAAATCGATT ATTTTTATTA CCTGAAACAT AGTGCCATTT TATTGGAGAT GATGGTTTGT 1920 

TAGATTCATA GAAAGCGATT GCCGCAAAAT CATAACCCTC TTCTTTGCGT ATTTTATCTA 1980 

ATGTCTCTTG AAATCTACGA TCTTCAATTA TTGCTTCTGG TGTCAAATCC TTTCACCTCT 2040

TATGCTTACA CTTTATTCTT ACGGTAAATA ATATATCTGC GATTTATATA TGTCAAAGGT 2100

ACACTCCAAA CATGCACCAA ACGTGTAAAT GGCCAACAAG CCATAATAGT GAAACCTAAC 2160 

AATATATGCA TTTTAAATGC AATCGGCACA CCACTCATCA ATGACGCATC TGGTTTTAAC 2220

ATAAATAATT GTCTAAACCA AATTGATAAT GAAGTTCTGT AGTTAAAGTC TGGATGTTGT 2280

ATATTTGTTA CTAATGTTGC GTAACATCCC ATAAATACGA TAAGTAATAA TAAGAAATTT 2340 

ACAAATATAT CCGACGCTGA ACTTAATCTT CGAATACTTT TCGTAGTAAC ACGTCTCGCT 2400 

GTTAATAAAA ACATCCCTAT CAAAGTTATT ATACCAAAGA TGCTACCAAT ATAAACAGCG 2460 

CCTATATGAT ATAAATGCTC AGACACACCC ACTGCATCCA TCCATGGTTT CGGTATTAAC 2520 

AATCCAACTA CGTGTCCAAA AAACACTGGA ATAATACCTA AGTGAAATAA TAAACTTCCC 2580 

CACATCAACC TTTTTCTTTC TATTAATTCA CTAGATTTAG CTGTCCAAGA AAATTTATCA 2640 

TAACGATAAC GTGCAATATG ACCTGCGACA AAGACAACTA AACATAAATA CGGAAATATA 2700 

ACCCATAAAA ACTGATTAAG CATGATGTTT CACTCCTTTT GGTGATGTCA AACATAATTT 2760 

CAATGTTTTT CTAAGTGCTT GAATCACATA GGCATATGGA TTGTTATCTT CACCAAGTGC 2820 

ATTCGCCATC ACATATGTTC CATCCTCAAT AATCATAATG ATTAATTGAA TATTCTCTTC 2880

AGCTCTTGGA TCATTTCGCC ATTCTGCCAC TTGCAAAAAT TGAAGCATCA ACGGTAGATA 2940 

ATCAGAAAGT TCATTATCTA CCATTTCTAG TCCAAACATT TCATATAATA CCTTTAATTT 3000 

AGCTAACATT TGCCCACGTT CTTTTTGCGT ATCAAATTTG TTATACGTCA TATATAATGG 3060

TGCTTTTTTC GTAAAATCAA ATGTATCTGT ATAAATCGCT TTGATTTCTG ATAATGAAAA 3120 
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i 
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TGTTTCTTCA AAAGTTTTTG GATGAAAAGT 
TAATTTTTCT GGAAAACATA ACTGTTGTGC 3240

CATATATCCA AAACTTTCTT GATATTTTTT AAAATTATCG AAATTAATCA CGGAAAATCC 3300

CTCCATAGAA ATTCTCATTA TAAATTTCTT GACCAGTTTT CCCTGAACCT ACTGCAACGC 3360

CACAGCCTTC ACAGTTATCT CCAAAATGCT CGCCGCCGTA ATTGTATCCT GTACTACCTT 3420

GTGCGTGATA CGTATCTAAA TAGGTTTCTT TGTGTGATGT TGGAATAACA AATCGATCTT 3480

CATATTTGGC TAGTCCTAAT AAACGATACA TGTCTTTAGT TTGGCGCTCG GTTATACCTA 3540

ATCGCTCTAA TCGAGACGTG TCAAATGGCT GTTGAGTAAC TTGAGATCTC ATATAACTTC 3600

TCATCATTGC CATACGTTGT AGGGCTCCTT TTACTGGCTC TGTATCTCCT GCAGTGAAAA 3660

TATTAGCTAA GTATTCAATA GGTAAACGCA TTTCTTCAAT GGCTGGGAAA ATCGCATCTG 3720

GATTTTGAGT TGTATTTTTA CCTTCAAAAT AGCTCATAAT TGGGCTAAGT GGTGGGCAAT 3780

ACCAAACCAT CGGCATCGTT CTAAATTCAG GATGTAACGG AAATGCAAGT TTATATTCAA 3840

TTGCTAACTT ATAAATTGGA GAGTTTTGTG CAGCTTCAAT CCAATCGTAA CCAATACCAT 3900

CTTTTTCAGC TTGAGCAATG ACTTCTTCGT CAAATGGGTT TAAGAATATA TCTAATTGTT 3960

TTTCATATAA ATCTTTCTCG TCTACTGCTG AAGCTGCTTC ATGAACTCGA TCTGCATCAT 4020

ATAATAAAAC ACCTAAGTAA CGCATACGTC CTGTACAAGT TTCAGAGCAT ACCGTAGGCA 4080

TACCCGCCTC GATTCTCGGG AAACAGAAAG TACACTTTTC AGCTTTGTTC GTTTTCCAAT 4140

TGAAGTAAAC TTTCTTATAT GGACAACCTG TCATACAGTA ACGCCATCCA CGACATGCGT 4200

CTTGGTCAAC TAATACAATG CCATCTTCAT CACGTTTATA CATAGCACCT GAAGGACACG 4260

ATGCAACGCA ACTTGGATTC AAGCAATGTT CACATAAACG TGGTAAATAC ATCATAAAAG 4320

TTTCGTCAAA TTGGAATTTA ATATCTTCTT CTATTTTTTG GATGTTAGGA TCTTTTGGAC 4380

CTGTAACATG ACCACCTGCT AAGTCATCTT CCCAGTTAGG TCCCCATTCA ATTTCAATGT 4440

TATCCCCCGT AATTTCTGAA TACGCTCTAG CAACTGGCGA ATGCTTCCCT GATTTCGCAG 4500

TTGTTAAATG TTCATAATTA TAGTTCCATG GCTCATAATA ATCTTTAATT AATGGCATAT 4560

CTGGGTTATA AAAAATTTTA CCTAAAGCAA TTTTTGAAAT TCTACTTCCA GATTTTAATT 4620

CAAGTTTCCC TTTACGATTT AGTACCCAAC CACCTTTGTA GTGTTCTTGG TCTTCCCAAC 4680

GTTTCGGATA CCCTACACCT GGCtTCGTTT CTACGTTGTT GAACCACATG TACTCAGCAC 4740

CTGGACGATT TGTCCaAGTG TTTTTACATG TCACACTACA CGTATGGCAT CCTATGCATT 4800

TATCTAAATT TAATACCATC GCAAcTTGCG CTTTAATCTT CAAGCCAATT AACCTCCTTC 4860

ATCTTTCTAA CTGCTACATA TAAATCCCTT TGGTTCCCAA TTGGTCCATA ATAATTAAAG 4920
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GGCGCGTTGT GTGAACCACC ACGTGTATCT GTAATTTCTG ACCCAGGCGT TTGAATATGT 
TTATCTTGTG CATGATACAT AAACATTGTA CCTTTAGGCA TACGATGCGA AATAACTGCT 5100

CTTGCCGTTA CAACACCATT ACGGTTATAC ACTTCTAGCC AATCATTATC TTGGATATCG 5160

TGTTTTTCAG CATCTTCATT TGATATCCAA ACCGTTGGAC CACCTCTAAA TAGTGTCAAC 5220

ATATGCTTAT TATCTTGATA CATTGAGTGT ATATTCCATT TTCCATGAGG CGTTAAATAA 5280

CGCAgTACCA AAGCATCTGT ACCACCTTTA ATTTTCTTAT CTCTATTCCC AAATACCATT 5340

GGCGGCAATG TCGGTTTATA TACTGGTAAG CTCTCCCCAA ATTGTTGGAA AACTTCGTGA 5400

TCCACATAAT AACTTTGACG TCCTGTTAAT GTTCTAAAAG GTACTAGACG TTCTATATTC 5460

GTTGTAAATG GTGAATATCG TCGACCTTGT TTATTTGAAC CTGGGAATAC TGCTGTCGGT 5520

ATTACTTCTC GTGGTTGTGA AGTTATATTT AAAAACGAAA TTTTCTCAGC AGCGCGTTCG 5580

CTAGAAATAT CTTTTAACGG CATTCCAGTT TGTTCTTCGA GATCTTCATA TGATTTTTGT 5640

GATAATTTAC CATTCGTAGC AGATGAAATA CTTAGTATTG CATCAGCTAC ATTACGTGCT 5700

GTATCAATAC GTGGACGATT CGCTCTCACA GAATCATCAT TTGTATCACT CCACGTACCT 5760

AACATACTTT TTAATTCTTC ATATTGTTCA CTGACACCGA AACTTACACC ATGTGCTCCA 5820

ACTTTCCCTT TTTCAAGTAC AGGACCAAGC GTGACATATT TGTCGTAAAT TTTAGTGTAG 5880

TCGCGTTCTA CAATTGCAAA GTTAGGCATT GTACGTCCAG GTACCGCTTC AATTTCACCC 5940

TTCGACCAAT CTTTCACTAC GCCGTATGGT GTTGAAATTT CTTGCTTTGT ATCATGACTA 6000

AGTGGAGTTG TCACAACATC TTTAAACGTT CCAGGTAAAT AGTCTTTTGC CATTTCTGAA 6060

AATGCTTTTG CCAACGTTTT ATAAATATCC CAGTCTGAAC GCGATTCCCA TAACGGATCA 6120

ATGGCAGGAT TGAAAGGATG TACATATGGA TGCATATCCG TTGATGATAA ATCATGTTTT 6180

TCATACCAAG TCGCTGCCGG CAAAACAATG TCAGAATATA ACGGTGTTGC CGTCATTCTG 6240

AAGTCTAAAG AGACCACTAA ATCTAACTTA CCTGTTGTTT CTTCACGCCA CGTAATTTCT 6300

TCTGGCTTTT CATCTTCATT TGGTGTAGCT AATAACCCTG ATTTTGTGCC AAGTAAATGC 6360

TTCATAAAGT ATTCTTGACC TTTTGCAGAA CTTGAAATTA AGTTTGAACG CCATATAAAT 6420

AATGATTTTG GATGATTCTT TTTCAAATCA GGATCTTCTA TTGCAAATTG TGTTTGTTTT 6480

GATTTCACTT CATCAATTGC ACGTTGCAAA ATCGCTTCAT TTGAATCTAT ACCTTCATCT 6540

TTAGCTTCTT CTGCAAACAA CAAACTATTT TTATTAAATT GTGGATATGA TGGTAACCAA 6600

CCAAGTCTAG CTGCTAAAAC ATTATAATCA GCTGGATGTT GATGCTTTAA CTCCTCTGTT 6660

TTAGCTAATG GAGATTTTAA ACGATCTACA TTTGACTCTT CATATTTCCA TTGGTCTGTT 6720
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6600 
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6720 
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AATGCGACAG 


TACTCCATCC TTCAATCGGA CGACATTTTT CTTGTCCCAC ATAGTGAGCC 6840

CAACCGCCAC CATTCACACC TTGACAGCCA CATAACATAA CTAAGTTTAA GATTGAACGA 6900

TAAATCGTAT CTGAGTTAAA CCAATGGTTA ATACCCGCAC CCATGATAAT CATTGAACGC 6960

CCTTCAGTAT CGATAGCGTT TTGCGCAAAT TCTTTCGCTA CTTGAATGAC AACACTTTGT 7020

TTTACGCCTG AAATGGCTTC TTGCCAAGCA GGTGTATATT TTGATTCTGC ATCGTCGTAT 7080

CCTTTTGATT CTAATTTATG ATCAAAACGA CGCACGCCAT ATTGACTTGC CATTAAGTCA 7140

AAAATTGTAG CAATACGGAC TTTGTCACCA TTTGCTAAAG TGACTTGTCG AGTTGGAATT 7200

GGACGATTGA ATATCCCATC TCCATCACTA TCAAAGTATG GGAATTGAAT TGTTTCTAAT 7260

TCGTATCCAC CTTCTGTCAT TGATAATGTA GGGTTAATTT TAGAACCATC TTCTGTTTCT 7320

AGTTTTAAGT TCCACTTCTT ACCTTCTTCC CAACGTTGAC CCATTGTGCC ATTAGGTACT 7380

ACTAAACTAT CGCTGATTGC ATCATGAATA ACTGGCTTCC ATTCGCCTTG CTCTGTTGTT 7440

TGACCTAAGT CACTCGCTCT TAAAAATCGA CCCGCTTTAT ATCCATTTTC ATCTTCATCC 7500

AGCATGATAA GAAACGGCAT ATCTGTATAT TGTTTAGCGT AATTTATAAA GCGTTCATTA 7560

GGTTGATTAA CATAATGTTC TTGTAAAATA ACATGCGTCA TTGCTTGTGC AATTGCAGCA 7620

TCTGAACCAG GATTCGGTGC TAGCCAGTTA TCTGCAAATT TCACATTTTC TGCGTAATCT 7680

GGTGCTACTG AAATGACTTT TGTACCI-1TA TAGCGGACTT CAGTCATAAA ATGTGCATCC 7740

GGAGTACGTG TTAAAGGTAC ATTAGAGCCC CACATAATAA TGTATGATGC GTTATACCAG 7800

TCACTTGATT CAGGCACATC TGTTTGCTCT CCCCAAATTT GTGGAGAGGC AGGTGGTAAA 7860

TCTGCATACC AGTCATAAAA ACTAAGCATT TCACCACCAA GCAAATTGAT GAATCGAGCA 7920

CCTGCTGCAT AACTAATCAT TGACATCGCT GGAATAGGTG TAAATCCTGC GATTCGATCT 7980

GGACCATATT TTTTTATTGT ATACAGTAAT TGTGCTGCGA TTATCTCTGT AACGTCTTTC 8040

CAATTTGAAC GCACGTGCCC TCCCATACCT CGGGCTTGCT TATATTGTTT GGCTTTGTCT 8100

TCATTTTCAA CAATAGACGC CCATGCAGCA ACGCGATTAC CATTGTTTTC TTCTAATGCT 8160

TCAGTCCATA AATCCCAGAG TTTTCCACGA ATATATGGAT ATTTGATTCG AAGCGGACTG 8220

TATTCATACC AAGAGAATGA CGCACCTCGT GGACATCCTC TCGGTTCATA TTCAGGCATA 8280

TCCGGACCAC AACTTGGATA GTCAGTTTGT TGATTTTCCC AGGTAATCAC ACCATTTTTC 8340

ACAAATACTT TCCAAGAACA TGAGCCTGTA CAGTTAACAC CATGTGTTGT TCTTACTTCT 8400

TTATCGTGGC TCCAACGTTC TCTGTACATT TTTTCCCATT CTCTACTTTT ACTTTCTAGG 8460

ATCGACCAAT TCCCATTAAA TTTTTCTGTT GGCTTAAAGA AATTCAATCC AAATTTTCCC 8520
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TAAAATGCCC AAGACTATTG CTTTAATTAG ATTGTACATT TTTTCACAAA CATAAAATAT 864 0 

TAGGGAATCA CCTAATTACT TAAGGAATTT CCCTATCAAT AACGGGATTT CATTGAAATA 8700 

ATACACAATC ATGTATGGTC ATGCTTATTG CCAATCTAAA TCGTTCAAAT TTGGCACAAC 8760 

GACAAATAAG GCTTCAACAC GAATATATTC TCTCGGTTGA AACCTTACTT ATTCATTTAT 8820 

TTTTTATAAA TTAGTGACAT AACACTGTAT TAGCATCTGC ACGATCGGTT GAAATATATG 8880 

TTACATTTTC TTGCTGCTTA ATAAATGCAT CATAGTAATC ATATTGCGAC GAATGATATG 8940 

TGCCATTCGA TGTATCATTT GGGTTTAGCA AACAGCCATA ACCTTCGTCA TATAAATGTT 9000 

CACAGAGCAT AAGGGCGTCA TGTTTAGAAC CACTTACTAC ATAAAATTGC TTCATAGGAT 9060 

CATATGATTT AGGAGTGTTT TCAGTATAAT CAACAACTTC CCCTATAATA CATATACCTG 9120 

GTTTCGCCTC AATTGAATAG TGTTGCAATT TTGAAATAAT ATTACTTAAA CGCCCCTTAA 9180 

CAACAAACTC GTTAAAACAC GATGCTTGAA AGACAATCGC TATCGGGTAA TCAATATCTG 9240 

TGTATTGTTG TATCTGTGTG ATAATTTTCC CTAAACGTTT TACCCCCATA TAAATTGCTA 9300 

ACGTGCCACC ATTCACTAAG GAATTGACAT CCACTTCATT TTCTTCTGAA TCTTTAAAGT 9360 

GACCTGTAGA AAATGTCACA CTTTTAGCAA CTGTACGCAT TGTCAAACCT GTCTGCATAG 9420 

TAGCAACTGc tGCGCTCGCT GATGTCACCC CTGGTACAAT TTCAAACGCA ATATGATGTT 9480 

CATTTAGTAT GTCGACTTCT TCTTGCACAC GACCAAATAT CGCTGGATCG CCACCTTTAA 9540 

GTCTAACAAC CTTGTTATAT CGACGCGCTG CTTCCACGAT ACAGTCATTT ATTTTTTCTT 9600 

GCTGAATATG TTTTGCATAC GGCTTTTTAC CAACATCGAT AATTTCAGTA GTCAAATTCG 9660 

CATATTGTAA AATTAACGGA TTCACTAATC GATCATATAG AATGACATCC gCTTCACGTA 9720 

TTAAACGCTC AGCCTTTTTC GTCAAATAAT TCGGATTACC TGGACCCGCA CCTATCAAGT 9780 

AAACCTTGCC ATATTCCTCT ACAGACATAT ATATACGTTC CCGTCTGTAA CTTCTACCTC 9840 

ATAAACATCT ACACAACCTT CATCAGGTTC TTGAACAATA CCTGTATTTA AATCAATTTT 9900 

TTGATCGTGG AGCGGGCAAA ATACATATTC CCCACTCACT GTCCCTTCAG ACAATGGTCC 9960 

TTGTTTGTGT GGACAGATAT TGTGAATCGC ATGAATTTTG CCACTTTCTG TTAAAAACAA 10020 

CCCTACCTCT TTGCCTTTGA CAATAACCTT TTTTCCAATT AGGGGTGTTA ATTCATCTAT 10080 

AGTTGTCACT TTAATTTTTT CTTTTGTTTC CATGTATTAC ACCTTCTCCA CTTCAAAAAT 10140 

TCTACGTGCT TGAGCATTGC TAGTTATTGC TTCCCAAGGT TCAGCTTCGA CTGCTTTTTT 10200 

AGCATCCATA ATGCGTTCAA ATAGTTCATT TTGTCTTTCT GGGTCAAGTA AGACTTCTTT 10260 

TACATTTTCA AATCCAAGTC TTCTTAACCA TGGCGCTGTT CTTTCAGCAT ATATACCTGT 10320 
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AGTTGTTAAA AATTCAGCTT TTTCAACTTC 
TGTACCACCA TTACCACCGA TATAGATTTG 10440

GAATCCATTT TCAACTGAGA TAATACCAAA ATCTTTAACA CCTGATTCAA CACAACTTCT 10500

TGGGCAGCCT GATACACCCA TTTTGAATTT ATGAGGTGTA TCGATGTATT CAAATGTTTT 10560

TTCTAAACGA ATGCCAAGTC GTGTCGTGTA TTGCGTACCA AATCGACAAA ACTCTTTACC 10620

AACACAGCTT TTAACTGAGC GTGTTTTCTT ACCATAAGCT GATGcTGAAC GCATACCTAG 10680

GTCTTCCCAT ATATTTGGTA ATTCTTCTTT TTTAACTCCA TACAAACCAA CACGTTGTGA 10740

ACCTGTCACT TTAACTAGTG GCACATGATA TTTCTTAGCC ACTTCTCCTA GACGAATCAG 10800

TTGGTCTGCA TCTCTAACAC CCCCACGCAT TTGAGGTATA ACAGAAAATG TACCATCATT 10860

TTGAATATTC GCATGGTAAC GTTCGTTAGC AAATCTTGAT TCTCTTTCAT CTTCATGATC 10920

ATGTGGATAA ACCATGTTTA AATAATAGTT GATTGCTGGT CGACATTTTG GACATCCACC 10980

TTTATTTTTA AAGTTTAAAA CATGTCGAAC TTCTTTAGAT GTTTTTAAAC CTTTCGCTCT 11040

TATTTGCGTT ACTATTTGAT CGCGTGTCAA ATCAGTACAA CCACATATAC CAGCAGGTTT 11100

TGCGGCAACA AAGTCATCTC CTAAGGTGTG CTGCAATATT TGAGCAATTT GCGGTTTACA 11160

TTTACCACAT GAATTCCCCG CTTTTGTTTT AGCCGTTACT TCTTCAACTG TTGTAAAGCC 11220

ATTTTCCGTA ATCGCATTTA CTATAGTACC TTTATCAACA CCATTACAAC CACAAATTGT 11280

TTCATCATCA GCCATATCAG CAATTGATAG CGATGCCTCT TCTCCACCTT TAGTAAGCAA 11340

TGATACAAGT GTGTAATCTT CAGTGGATTC ACCTTTTTTC ATCATGTTAT AAAAGCGTGA 11400

ACCATCATCG ATATCACCAT ATAGTACTGC ACCAACTACA TTACCGTCTT TTAAAAAGAT 11460

TTTTTTATAG TTATTATCAA CACTATTAAA TATTTCAATA CCTTTAATTT CTGCATTTTC 11520

TACAATTTGA CCAGCACTAT ACAAGTCACA CCCAGAAACT TTTAATGACG TAAATGTTGT 11580

TGATCCCTTG TATCCGTTCG TTTCTTTATT TGTTAAATGA TCAGCTAATA CTTTACCTTG 11640

TTCATATAGT GGTGCAACGA GTCCATAAAC TTTGCCGTTA TGTTCTGCAC ATTCACCAAC 11700

TGCATATACA TTGCTATCAC TTGTTTGCAT CACATCATTG ACAACAATAC CACGATTAAC 11760

ATCTAGACCT GATTCTTTGG CTACTTCTGT GTATGGTCGT ATACCTACTG CCATAACAAC 11820

TAAGTCTGCC GGAATCTCGC GTCCATCAGC 11880

TAAGATTTCA GTTGTGTTGG CTTGCATTTC AAACTTCATA CCTTGCTTTT CTAGATCTGC 11940

TTTAAGCATA TTTCCAGCTT TACGGTCTAG TTGCATTTCC ATCAACCATT CAGCTAAATG 12000

TAACACCGTT ACTTCCATAC CTTGATCTAA TAAACCAOGT GCACACTCTA AACCTAGTAA 12060

TCCTCCACCA ATTACAATTG CTTTCTTTTT AGTCTTAGCA ATGTTCATCA TTTGTTCAGT 12120
GAATGCTTTA GAACCTGTCG CAAAAATCAA TTTATCGTAT GATACTTCAA TACCATTTGC 12240

AGTAGTAACT GATTGATTTG CTCTATCTAC TTCAATTACA GGATCATTTG TAATTAACTC 12300

GATACCATGT TCCTCATACC ACTCATATGG ATTCATAATT GTTTCTTCAA CTGTCATTTT 12360

ATTTTGTAAA ATATTTGAAA GCATGATGCG GTTATAGTTT GGATAAGGTT CTTTACCTAT 12420

TACCGTAATA TCATATAAAT CGTTGGCGCG CTCTAATATT TCTTCGATTG TTCGAATGCC 12480

CGCCATACCG TTACCAATCA TTACTAGTTT TTGCTTTGCC ATAAAATATG CCCCTTTACT 12540

CCATAATATT TATTTCAAAA AAAGGTATTA ATTTTTCGTT AGTGCTTTTA TATTTTCATT 12600

GGAATCATTA AGCTTTCTAA TCTATCGTTA ATGATTTGCT TTAAAATTGG GTCGAAGTTA 12660

ATTGAAGGTG TGAAGTGTAT ATCTGTATTA ATAACCATGT CATTCATTTG CTGCTTCACT 12720

TTGTTAACAA GTCTTCCGTC ATATAAAAAT AATGGTACGA CAATCAATTT TTGATACCGT 12780

TTCGAGATGC TTTCTAAATC ATGTGTAAAA CTAATCTCTC CATATAGCGT TCTCGCATAT 12840

GTCGGCTTGC TAATTTGCAA ATTTTGAGCG CATATTTGTA ACTCTTCGTG TGCCTTAGTA 12900

AACTTTCCAT TAATATTGCC GTGTGCAACA ACCATAACTC CAACTTGTTG TTCGTCACCT 12960

GCTAATGCGT CACAAATACG TTGTTCAATT AATCGTCTCA TTAAAGGATG TGTGCCAAGT 13020

GGCTCGCTTA CTTCTACCTT TATGTCTGGA TACCGTCGTT TCATTTCATG AACGATATTC 13080

GGTATATCCT TGAGATAATG CATTGCACTA AAGATTAGCA ATGGTACAAT TTTAAAATGG 13140

TCAACCCCAC TTTGAATCaA CGTCGTCaTT ACCGTCTCTA AATCCtGATG CTCACTTTCt 13200

AAAAACGCAA TATCATAGTG ATGTATATCA TCTTTTACTA ATTCAGAAAT AAATGCTTCT 13260

AACGCTTGaT TCTGTCGTCC GTGCCTCATG CCATGTGCAA CAATGATATT CCCATTCACA 13320

TTTACCAACC CTTTCACACG TATTGTATAC CAAATCATTT TGTTTTTGTG AAAAGAATCA 13380

CATTATAATG TAAAATCAGG GAATTCCCTG ATGCCTGTAG TCATGCATAT TCCTTATACA 13440

TTTTCCCTTT TTGTTAAATC AAAAAAAGCG ACCGATATAT GAATCCCTAC TCAACATTTA 13500

TTTGAGCAAG CATTAATATA TCGGTCGCTT GTAGTGTATA TTATTATCTT AAAATGGTGG 13560

TTGGCCTAAT ATTGTTTCGT CAAAGCGCTC GGGTATCAAT ACTTTGCGCA TGATCACACC 13620

TAAATCGCCA TCATCATTTT CATGTTCGCT GTATATTTCA TAACCTCTTT TTTCATAAAT 13680

TTTAAGTAAC CACGGATGCA ATCTTGCAGA TGTACCTAAA GTAACTGCCG CTGACTTTAA 13740

CGTATCTCGC AAAAATGCTT CTTCAACATA AGTAAGTAAT TGGCTACCAT AGCCTTTCCC 13800

TTCATACTCA GGATTTGTCG CAAACCACCA GACAAAAGGA TAACCCGAAA TACTTTTCAC 13860

ACTTCCCCAA GGATATCTAA CCGTAATCGT AGATATAATT TCATCATCAA TTGTCATGAC 13920
CCAATCAATA CCTAGTTCTC TTAGAgGCGT AAATGCTTCA TGCATGAGTT CTTGCAATTT 14040

TTCTGCATCT T 14051

(2) INFORMATION FOR SEQ ID NO: 104: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1885 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104:
TAATCCTCAA CTTnGATTAT ATGGCTTGGG CGCATATGAA CTGCTTAGTT TAGTGTATGA 60

CATTCATACA GTTCGCATGA CTATCATACA ACCTCGAATA GATAACTTTT CTACTGAAGA 120

GTTACCAATC TCAAGATTAC TTCAATGGGG AACCGATTTT GTTAAACCCT TAGCCAGACT 180

TGCTTATAAC GGTGAAGGTG AGTTTAAAGC AGGTAGTCAT TGTAGATTCT GTAAGATAAA 240

GCATTCATGT AGAACACGTG CAGAATACAT GCAAAATGTG CCTCAAAAGC CACCACATTT 300

GTTGAGTGAT GAAGAGATTG CAGAACTTTT ATATAAACTG CCTGATATCA AAAAATGGGC 360

TGATGAAGTA GAGAAATATG CGTTAGAACA AGCGAAAGAG AATGATAAAA CGTATCCAGG 420

TTGGAAGCTA GTCACGGGAC GTTCAAGGAG AGTGATAACT GATACAAAAG CAGTCCGAGA 480

CAGGTTAGTT GAAGCGGGTT ATAAACCTGA AGATATTACA GAAACCAAGT TACTTAGCAT 540

TACGAATTTA GAAAAATTAA TCGGCAAAAA AGCATTTTCT AAAATTGCAG AAGGCTTTAT 600

AGAAAAGCCG CAAGGTAAAT TAACACTTGC TACCGAGTCT GATAAACGAC CAGCTATAAA 660

GCAATCTGCT GAAGATGATT TTGACAAACT ATAAAAATTA AAAAGGACGG TATATAAACA 720

TGA^AGCAAA AGTATTAAAT AAAACTAAAG TGATTACAGG AAAAGTAAGA GCATCATATG 780

CACaTATTTT TGaACCTCAC AGTATGCAAG AAGGGCAAGA AGCAAAGTAT TCAATCAGTT 840

TAATCATTCC TaAATCAGAT ACAAGTACGA TAAAAGCCAT TGAACAAGCT ATAGAAGCTG 900

CTAAAGAAGA AGGAAAAGTT AGTAAGTTTG GAGGCAAAGT TCCTGCAAAT CTGAAACTTC 960

CATTACGTGA TGGAGATACT GAAAGAGAAG ATGATGTGAA TTATCAAGAC GCTTATTTTA 1020

TTAACGCATC AAGCAAACAA GCACCTGGTA TTATTGACCA AAACAAAATT AGATTAACGG 1080

ATTCTGGAAC TATTGTAAGT GGTGACTATA TTAGAGCTTC AATCAATTTA TTTCCATTCA 1140

ACACAAATGG TAATAAGGGT ATCGCAGTTG GATTGAACAA CATTCAACTT GTAGAAAAAG 1200

GCGAACCTCT TGGCGGTGCA AGTGCAGCAG AAGATGATTT TGATGAATTA GACACTGATG 1260
TTGAGGTGTC AAGAATTTGA AATTTATGAA TATAGATATT GAAACATACA GCAGTAACGA 1380

TATTTCGAAA TGTGGTGCCT ATAAATACAC AGAAGCTGAA GATTTCGAAA TTTTAATTAT 1440 

AGCTTATTCG ATAGATGGTG GAGCGATTAG TGCGATTGAC ATGACTAAAG TAGATAATGA 1500

GCCTTTCCAC GCTGATTATG AGACGTTTAA AATTGCTCTA TTTGACCCTG CTGTAAAAAA 1560 

GTATGCATTC AATGCTAATT TCGAAAGAAC TTGTCTTGCT AAACATTTTA ATAAACAGAT 1620 


GCCACCTGAA GAATGGATTT GCACAATGGT TAATTCAATG CGTATTGGCT TACCTGCTTC 1680 

GCTTGATAAA GTTGGAGAAG TTTTAAGACT ACAAAGCCAA AAAGATAAAG CAGGTAAAAA 1740 

TTTAATTCGT TATTTCTCTA TACCTTGTAA ACCAACAAAA GTTAATGGAG GAAGAACrAG 1800 


AAACCTACCT GAACATGATC TTGAAAAAtG GCAACAATTT ATAGATTaCT GTATTCGAGA 1860 

TGTAGAAGTA GAAATGGCGA TTGCT 1885 
(2) INFORMATION FOR SEQ ID NO: 105:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2656 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105:

TAATCCTTAG TTCACTGnCA AATTTCAAAA CACCAGTTCC CTCTATCTGC ATCCATAGAA 60 

ACTGnATGTT TGTGTCAATA ACCGGATTAT ATTGTGATGn TGTTTGTAAC TCGATTAAGT 120 

TATCATCTTT CGAAAAATTA TCTACTACCA TTATTCAACC ACCTTTCCTT CGAATAAACT 180 

CCATTTACCA ACkCCACCAG TACCAAAGTT TCTAACTAAA AATTGATGTG CAGACGGGAA 240 

GTTATTACGT CTTAATACTT GTGTTGTATT ACCTGGTGTA TTCGATTTTA CTTCTAATAT 300 

CCAACCTGCA ATACCTTTAA AGTCTTTAGG AAAATCAGTA AATCGGTTTG ATTCTTCAGT 360 

AGTGATATAG AAATCTAAAC CAACGATTTT TAAATCTGAT AATTTTGTAA TACTCTTAGG 420 

GATATGTTCC CAATAACCGG CGTTTTGCGG GCAGAAATTC CATGCTCCGT TGTTTTTCTT 480

ATTGAAAATG TCAATGACAC GTTCGAATTT AAGCATATTT CTACCTGTGC TGTTTCTGGt 540

AAGTACTTGT CTTAGAGCAC CATTATAGTG TCCAGGCAGT ACATCCAAGA ACCACCCTGC 600 

ATCTCTAAAC GCTTTCGGTA ACGGGAAATC TAATGCATTT TGTGTGTCTT GaCGTATAGA 660 

TATAGTAATG ACCAACTTCC GTAATATCAC TTAGATATGC TGGGTTCTGT ATTGGTAACG 720 

GTTTAACACG TCCGCCTGAA TCAGTCATTG ATACTTGAGG TGCGATGTTT TTCAAGAATT 780 
TAGTTACCCC GATTAGAAGT GCTTTACGTC CTGTTTCTAG ATCGTAATAC ATATCTAGAC 900

CCTCAGCCTC TTGGAAATCT CCTTTAAAGT TGTTATTCAC ACCGCCTATA TCGATGCGAC 960

GTTTAAATAA CAATTCTTTC GTTTTGATAT CGAAGCCTTG TAAGTAGTTA GGGTTGGCTG 1020

TATTCGAATC ACCTGTATAC CAATATAAGA TACCTGCATC ATAAGTGATA CCTTGCATAG 1080

GTTGTGTATC TGAAGTGTAT TCCATAGGTA TATCCATTTG ATACAATACT TTGTCTATAC 1140

CTTTATCAAT ATCGTCAGCA CTTCTAACCT CAACAAAGTT CAACGAATTC TTAAGTTGTC 1200

TTTCAGTGGG TTTATATTCA CGTCTAAAAA TCATTAAATT TTCTACCGGA TTATAAATCG 1260

CTGACGTATA TCTGTCGTTA AATATATTCG GCATGACATC TTGCATTTCA TTACCATAAG 1320

TTATTTCTCC AGTTCTATAT TGGAAACGTA CAAACTTGTT GTTTTTGTTA CTGTCCAATA 1380

CAGCTGAATA AATCCATAAT TCTCCATCAA TGTATCTATA CGCATTGTGT GTACCGTGAC 1440

CGCCGTTTTT AACAAGCAAT CTATCAATAA ATTGTCCGTT GGGCTTCAAT CTAGATAACA 1500

TGTAATGATT ACCTGGACGA GCTTGCGTCA TATAAATAAT TTTCGTTCTA GGGTCTACCC 1560

AAAATGATTG CATTACTGCA TTTGTATATG GCGATAAATC AGTGATAAAT TCCGGTTCTT 1620

GCTCTTTTGG TTCGAATCGG TATTCTGTCG CTCGATATTC TTTATAGTGT TCATCTACAG 1680

CTTTCTCAAC CTTTTTAGTG AAAACATCTA GTGTTGAATA ATCATGATAC AAACGATCTT 1740

GCAATGTCTT ATGACCATAA CCTGTATTAT CAACGCGCGC GTCTTTTAcT TCGTTGATAC 1800

CGTCGCCGTT ATGACCTAGT ACCATGTTGC TAAATCGACC GTTTAAATAT GTTAAAAAGT 1860

CAGAGACGTT ACTTGTAACA TTTAAATGTT CATACTTTAT TTGTTCTCCA TCATGTGCGA 1920

ATACCTCTTT ATTTCTGTGG TATTCAAGAG AGAAATTAAA ATCCGTCAGC ATGTCTGAAA 1980

TAAGTTTAAA GTTATACTCA TTTTCATCTA CATATCTGTA GTCAAAGACT CTACTTAAAT 2040

CTGTAATTAG TTTATTACTC ATGTTTTCCT CCTTTACTAT CCATAAAACT GATmATAATT 2100

TTTAATAAGC TCATACATAA TAACTTCATG ACCTCTTTCA TTAGGATGTA ATCCATCAGG 2160

CATGCTAGAT TTTCTAAATG CTGGATTATA TGGTTTGAAA TAATCTGTGT GATAAGCATC 2220

ATATACTGGT ACATCCAATT CACTACAAGC CAATATCTGA GCATTGACAT AATCCTCTAA 2280

AGTTAACCCT AGTTTGTTTT TGTCCGTATC TTTACGGCGT ATCGTTGTAC CACTCATAGG 2340

GCATTGCCTA GTAGCTGTCA TTACAAGTAT TTTTGAAGCT GGATTATTTT TCCTGATAAC 2400

TTCAATTGCA GAACAAAAGG CGCCGTAAAA CGTTTTAG7G TCGGTTTTAT CAGTGCCTAT 2460

CGGTACGCCT GCCCAATAAC CATGTAACCA GTCATCATCT GTACCTTGTA ATATGATTAG 2520

GTCTCCTCTT ATTTGCTCTG CTTGTCTaTA AATGCTGTTT TCTaCCGCTT CTTTACCTAT 2580
CTTGCCTAAC ATTTCT 2656
(2) INFORMATION FOR SEQ ID NO: 106: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4854 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106: 

AAAATGAGGG TTCTAGCGGA AATTACCAAA AGCGTGGTTC ATACTATGGG CAGCGTAATC 60 

GTATTTCAAA AGAAAAAACA CCTAAATGGT TAGaAAATAG AGATAAACCT AGTGAAGAAG 120 

ATTCGGCTAA AGATAATAGC GTAGATGATC AACAATTAGA GCAAGATCGA CAAGCATTTC 180 

TAGATAAATT ATCTAAAAAA TGGGAGGAGG ACAGTCAATA ATGAAGCAAT TTAAAAGTAT 240

AATTAACACG TCGCAGGACT TTGAAAAAAG AATAGAAAAG ATAAAnCAGA AGTAATCAAT 300 

GACCCAGATG TTAAGCAATT TTTGGAAGCG CATCGAGCTG AATTmACGAA TGCTATGATT 360 

GATGAAGACT TAAATGTGTT ACAAGAGTAT AAAGATCAAC AAAAACATTA TGACGGTCAT 420

AAATTTGCTG ATTGTCCAAA TTTCGTAAAG GGGCATGTGC CTGAGTTATA TGTTGATAAT 480

AACCGAATTA AAATACGCTA TTTACAATGC CCATGTAAAA TCAAGTACGA CGAAGAACGC 540 

TTTGAAGCTG AGCTAATTAC ATCTCATCAT ATGCAACGAG ATACTTTAAA TGCCAAATTG 600 

AAAGATATTT ATATGAATCA TCGAGACCGT CTTGATGTAG CTATGGCAGC AGATGATATT 660 

TGTACAGCAA TAACTAATGG GGAACAAGTG AAAGGCCTTT ACCTTTATGG TCCATTTGGG 720 

ACAGGTAAAT CTTTTATTCT AGGTGCAATT GCGAATCAGC TCAAATCTAA GAAGGTACGT 780 

TCGACAATTA TTTATTTACC GGAATTTATT AGAACATTAA AAGGTGGCTT TAAAGATGGT 840 

TCTTTTGAAA AGAAATTACA TCGCGTAAGA GAAGCAAACA TTTTAATGCT TGATGATATT 900 

GGGGCTGAAG AAGTGACTCC ATGGGTGAGA GATGAGGTAA TTGGACCTTT GCTACATTAT 960

CGAATGGTTC ATGAATTACC AACATTCTTT AGTTCTAATT TTGACTATAG TGAATTGGAA 1020 

CATCATTTAG CGATGACTCG TGATGGTGAA GAGAAGACTA AAGCAGCACG TATTATTGAA 1080

CGTGTCAAAT CTTTGTCAAC ACCATACTTT TTATCAGGAG AAAATTTCAG AAACAATTGA 1140 

ATTTTAAAAT GATTGGTGTA TAATGAATAC AAATCTAAAT CGTTTAAATG ATTGAAGACA 1200 

AGATGATCTA ATCAATATTA CACAGAAAGC CATTGTTTGA TGAGAATATG GTTAATAAAT 1260 

TAGATGATTA CTACTTCATT TATGGTATTT GTAATGAATA CCCGGATCAA GACCGTTATC 1320 
CTCGTCCCTT GTATAGGGGC GGGATTTTTT GTTTTTTTCA GACATAAATG TTTGTTGGTG 1440

TCATAAATTC CCTGTTTATT GTTAATAGGT TTAATGTTAA AACGATGATT GTTGTTCAAT 1500 

TTTTTAACGA GGTCAGATAA AAGTATTTAT AAAGCAAATA GGAGGGTTTA ACATGGAACA 1560 

AATTAATATT CAATTTCCAG ATGGTAATAA AAAGGCGTTT GATAAAGGTA CTACTACTGA 1620 

AGATATAGCA CAATCAATTA GTCCTGGATT ACGTAAAAAA GCTGTTGCCG GCAAATTTAA 1680 

CGGGCAACTT GTAGATTTAA CTAAACCGCT TGAAACTGAT GGATCAATTG AAATTGTGAC 1740

ACCAGGTAGT GAAGAagcGT TAGAGGTATT ACGTCATTCT ACTGCACATT TAATGGCACA 1800 

CGCGATTAAA AGGTTATATG GTAATGTTAA ATTTGGTGTA GGTCCTGTAA TAGAAGGTGG 1860 

ATTCTACTAT GACTTCGACA TTGACCAAAA CATCTCATCT GATGACTTTG AACAAATTGA 1920 

AAAAACAATG AAACAAATCG TTAACGAAAA TATGAAAATC GAACGAAAAG TGGTTTCACG 1980

AGATGAAGTG AAAGAGTTAT TCAGCAATGA TGAATACAAA TTAGAATTAA TCGACGCGAT 2040

TCCTGAAGAT GAAAATGTAA CATTATATAG TCAAGGTGAT TTTACTGATT TATGTCGTGG 2100 

AGTTCACGTT CCATCAACAG CTAAAATTAA AGAGTTTAAA CTATTATCTA CAGCAGGTGC 2160 

ATACTGGCGT GGAGATAGTA ACAACAAAAT GTTACAACGT ATATACGGTA CTGCTTTCTT 2220

TGATAAAAAA GAATTGAAAG CACATTTACA AATGTTAGAA GAGCGTAAAG AACGTGATCA 2280

TCGTAAAATT GGTAAAGAGT TAGAACTATT CACAAATAGC CAATTAGTTG GTGCTGGTTT 2340

GCCATTATGG TTACCTAACG GTGCAACAAT TAGACGTGAA ATTGAACGTT ACATTGTTGA 2400

TAAAGAAGTT AGCATGGGAT ATGACCACGT TTATACACCA GTACTTGCTA ATGTTGATTT 2460

ATACAAAACA TCTGGTCACT GGGATCACTA TCAAGAAGAT ATGTTCCCAC CAATGCAGTT 2520

AGATGAAACT GAATCTATGG TATTACGTCC AATGAACTGT CCACATCATA TGATGATTTA 2580

TGCQAATAAA CCACATTCAT ATCGTGAATT ACCTATCCGT ATCGCTGAGC TAGGAACGAT 2640 

GCATAGATAT GAAGCAAGTG GTGCTGTATC AGGATTACAA CGTGTTCGTG GTATGACTTT 2700 

AAATGATTCA CATATCTTTG TTCGACCTGA TCAAATTAAA GAAGAATTCA AACGCGTTGT 2760 

AAACATGATT ATTGATGTGT ATAAAGACTT TGGTTTCGAG GATTATAGCT TTAGATTAAG 2820 

TTATAGAGAC CCTGAAGATA AAGAAAAGTA CTTTGATGAT GATGATATGT GGAATAAAGC 2880

TGAAAATATG CTTAAAGAGG CAGCGGATGA GCTTGGCTTA TCGTACGAnG AAgCGATTGG 2940 

TGAAgCGGCA TTCTATGGTC CGAAACTAGA TGTTCAAGTT AAAACAGCGA TGGGTAAAGA 3000 

AGAGACATTA TCAACAGCAC AACTTGATTT CTTATTACCA GAACGTTTTG ATTTAACTTA 3060 

TATTGGTCAA GATGGTGAAC ATCATCGTCC AGTTGTTATT CATCGTGGTG TTGTATCAAC 3120 
AGCGCCAAAA CAAGTTCAAA TCATTCCAGT TAACGTTGAT TTACATTATG ATTATGCGCG 3240

CCAATTACAA GATGAATTGA AATCTCAAGG CGTTCGTGTA AGTATTGATG ACCGTAATGA 3300 

AAAAATGGGT TATAAAATCA GAGAAGCTCA AATGCAAAAA ATACCTTATC AAATCGTAGT 3360

TGGGGATAAG GAAGTTGAAA ATAATCAAGT GAATGTGCGT CAATATGGAT CGCAAGACCA 3420 

AGAAACAGTT GAAAAAGATG AATTTATCTG GAATCTAGTT GATGAAATTC GTTTGAAAAA 3480 

ACATAGATAG ACAGTTGTCG CAATAAAATG CTTTAAAACT TTTATTGCGT ATCAAGTTTT 3540 

ACAGGGTTGA TTATGCGTGA TGAATCCTGT ATATTACAAG TTAGTTAAAA TA7TAAATTG 3600

AGTTAGAGGT TGCATGTTTA ATTAGTAACT TGTCAGAAGT ATTTATGGTA CATAAGTTGA 3660 

ACAAGTGAAA GGTAAAGATG CCGAAATAGA TATAAACCAT AAATTATATC TATTGGGACA 3720 

GTTTTCGAAT AGGAACTGTA CTGTCACAGA ATGTGATGTG CTACCTTATA TAGATAATTG 3780

CCAAAGTGGT TGCATATCTT AAAGGTATGT AGCCACTTTT TTACTTTTAA TATCACTATG 3840

TTCTGTAAAA AAGGGTATGA AAGTGAATAA AGGTTATTTA TTTCTTGGCC TCTAAAACAT 3900 

GGAAAGGGAG CTTATATGTC AAAAGTTCAA AATGAAAGTA ACAATGTTGT CAAAAGGGGA 3960 

CTTAAAGATC GTCATATTTC TATGATTGCG ATTGGGGGTT GTATTGGTAC AGGTTTATTT 4020

GTAACTTCTG GTGGAGCAAT TCATGATGCA GGTGCTTTGG GTGCATTAAT AGGATACGCA 4080

ATTATCGGAA TAATGGTATT TTTCTTAATG ACGTCACTTG GCGAAATGGC TACGTATTTG 4140

CCAGTATCAG GTTCATTTAG TACATATGCT ACAAGATTTG TTGATCCATC TTTAGGGTTT 4200 

GCGCTTGGTT GGAACTATTG GTTTAACTGG GTAGTGACTG TAGCAGCAGA TATTACGATT 4260

GCAGCACAAG TCATTCAATA TTGGACACCA TTGCAAGGCA TACCCGCTTG GGCATGGAGT 4320 

GCGTTGTTCT TAGTTATAAT TTTTAGTCTG AATTCGTTAT CAGTTCGCGT CTATGGTGAA 4380

AGTCjAATACT GGTTGGCATT GATAAAAGTG GTTACAGTTA TTGTTTTCAT TGCAATTGGT 4440

TTATTAACGA TTGTCGGAAT CATGGGTGGT CATGTTGTAG GATTCGAAAT ATTTAATAAA 4500 

GGTGAAGGTC CAATTCTTGG TGGCAACTTA GGAGGAAGTT TGTTATCAAT TCTAGGTGTA 4560 

TTCTTAATCG CTGGTTTCTC ATTCCAAGGT ACTGAGTTAA TTGGTATTAC GGCTGGTGAA 4620 

TCAGAAAATC CTGAACGTGC TGTGCCGAAA GCAATTAAAC AAGTATTCTG GAGAATTTTA 4680

TTATTTTACA TTTTAGCCAT TTTTGTTATC GGTATGTTAA TTCCTTATGA TAGTAGTGCA 4740 

TTAATGGGGG GTAGTGATAA TGTAGCAACG TCTCCATTCA CATTAGTGTT TAAAAATGCT 4800 

GGATTTGCGT TTGCAGCATC ATTTATGAAT GCAGTCATTT TAACGTCTGT GTTA 4854
(2) INFORMATION FOR SEQ ID NO: 107: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2488 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107:

ATCAAAAATT GATTGTTTTC nATTTTTTGT TTCAGCGCGG GATCTTTTAC GTCTTTTGTG 60

AAAACGaTTT TATTATTAAC TACTTTTACT GGATAACTTT TGTATGTCGA GTCAGTAGCA 120 

TTTTTTCTAT CGTTTGTAGT TGTGTCATAT TCACCAgTTA TTTTATGTGT GTTCTTATCT 180 

ACCTTTAACA ACATACGGTC TTCTTTTAAA AGCTCATCTG ATCCAACAAC TGAATAAGAG 240

GATTCTATAT ACCATGTGTC TTGATCATTA TTTTCATAAT GGGGATTATC GTGACCATCA 300 

ATTTCATAAA GCGTTTCTAA GTTTTTAATA GGATACGTAC TTAGTACTTT TTTAAGACCA 360 

TCTTTCAAAT GAATTTGTTC CCACTTCATT GCCAAAAACA TATCGCCACT GACTACAATT 420 

GAAATAATAA TAATTGCTGC TAAGTTTAAC CAGAAAATTT TATGTGCTTT CATACATTCC 480 

CACCGTTTCT CAAAATACTT CATTAACACT ATAATAATAT ATTTTGAAAA ATATTTACAT 540 

CAGTATTAAA GTGAATATCA AATTTTAAAT TTATGAAAAT AATAGATATT TATAAAAAGC 600 

GGAAAAGAGA TACAATAAAA AACTGCATGA CGTTTGAGAC GTCACACAGT GTAACTAAAA 660 

ATTTAAAAAG TTGTTGCTAA TTTTTCAGCA TTATTAATAC TAGTTGCTTT AATTTCTTCA 720 

GTCTTATGAG GTTCAGCATT GTGTCCTTCA ATAATGATTG TTTCATATGA TGGCACACCT 780 

AAGAATGTCA TAATTGTTCT TAAATAACGG TCACCCATTT CAAAATCAGC AGCAGGTCCT 840

TCAGTATAAT ATCCACCACG TGATTGAATG TGTAATACTT TTTTGTCAGT TAGTAAACCT 900

TGTGGTCCTT CAGCAGAATA TTTAAAAGTT TTACCTGCAA TTGAAATAGC ATCAATATAT 960

GCTTTAACTA CAGGTGGGAA AGAAAGGTTC CACATAGGCG TTACAAATAC ATATTTATCT 1020

GCACTTAAAA ATTCTTCTAA AATGTCACTC AATCTTGAAA CTTTCATTTG TTCATCATCA 1080

GTTAACGTTT CGCCATTACT CATTTTTCCC CAACCAGTTA ATACATCTTT GTCAATAACT 1140

GGAATATAAG TTTCArATAA ATCAATATGT TTCACTTCAT CATCAGGATG TTGTTGTTGA 1200 

TATGTTTCGA TAAATGCTTT ACCAGCCGCC ATAGAATTTG ATACCAGTTC ATTAAAAGGG 1260 

TGTGCTGTAA TATATAATAC TTTTGCCATT TGAAAATTCT CCTCTGkTTC TGTTATTTTC 1320 

TTAAGTATAA TTATTATACT CGATATAAAA TTTAATATCA ATCAAAATAT TCAAATTACC 1380

ATCATTTTCT TCATCTATAT nTGGCAGTAC TACTAAAGTA TGAGTGCATT TAATTATGAa 1440 

ATAGTTGATT TaGAATAcAT ACTTAATACC CAAAATATAT GAAGGATGGA TGCCACTATG 1500 
ATTATTTATA TAGATGACAT TCAAAAATGG TTTAACCAAT ATACCGATAA ATTGACACAA 1620

AATCATAAAG GACAAGGACA CTCAAAATGG GAAGACTTTT TTAGAGGGAG TCGGATTACT 1680 

GAGACTTTTG GTAAATATCA ACATTCACCA TTTGATGGTA AGCATTATGG CATTGATTTT 1740 

GCATTGCCAA AAGGTACACC AATTAAAGCG CCGACGAATG GTAAAGTAAC ACGTATCTTT 1800 

AATAATGAAT TGGGCGGCAA GGTATTACAG ATTGCCGAAG ACAATGGAGA ATATCACCAG 1860

TGGTATCTAC ACTTAGACAA ATATAATGTC AAAGTAGGTG ATCGAGTCAA AGCAGGTGAT 1920 

ATTATTGCAT ATTCAGGCAA TACAGGTATA CAAACGACAG GCGCACATTT ACATTTTCAA 1980 

AGAATGAAGG GTGGCGTAGG TAATGCATAT GCAGAAGATC CAAAACCGTT TATCGATCAG 2040

TTACCTGATG GGGAACGTAG CCTATATGAT TTGTAGTTAT AGAAGGGTGC CCGCAGTCTA 2100 

AAAAATTAAG CAATCATTGT GTGAGTATGA TACTTACATA ATGGTTGCTT TTTTCAATGA 2160 

AAATCGTAAT GCTAAGTCAT ACTTGTTTGA TTTAGATATT ACTTAAAATG TAAGACAAGG 2220 

TTGTTAGCAT TGGCAGTGAA ATATCGCACA TAAAAAACAT TATTGTCACA CTAGAAAATA 2280

GTTGTGCACT ATATCAATTT TCTGTATAAA AGTTTAATTC TGACAGTAAT GTAAACGTTT 2340

ACAATTTATG ATTGACATTA ATAATGACTG AATATATGAT TTATGTAAGT ATTTGTGCAA 2400

CGTTTTCACA AAGTGTATTG CACaAyCAAA CTGtAAACaA aGTATGGGGg GCCATAACAT 2460

GGCAGAACTA AGTTAGAGCn TATTAAAA 2488
(2) INFORMATION FOR SEQ ID NO: 108: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4093 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108:

TTTTCTTTAT TTCAAmCTGT ATATTaATGA TGTCACTTCA TTTGATACGA TTCTTGATAA 60 

CCTATTCAAA ATTCCGCCAA ATAACATAAA TATTATATAA ATGCCGATAC TTTTAATCAT 120 

TTTCTACTTT TTCTTCGATA CGGAAACTTG TTTTCGAATT GAACACTTCA CCAGCTTTTA 180

AAATTGACGG TGCTTTTTCA CCATATAAAT TAATATCATT TGGTAAAAAT TGTGTTTCTA 240

ATGTAAAGCC AGAATGTGGT TTATAAATAT TAAATGGACT ATCCCACTCA TCAGGCTGGT 300

TAAAAGTAAA GAACACAACA TGAGGCATAT CTGTATCGAC CTCTAACATA AATTCATGAT 360 

TTTCAACATA CATTTTATGT TCACCAACTG TAAATGGGTG ATCGAGACCA CCAAAACGTG 420 
TATCTTCAAA CACTTCATGT AAATCTAGAA TATCACCTGT AACAATATTT CGCTCATCTA 540

ATACATACAT ATCTAATTGA TTACTTGAAA TGCGATGATT ATCAACGACA TTATTATCTC 600 

GATTCAAATT GAAGTACACA TGATTCGTAG GACTAAACAA TGTGTCTTCT GATGCAACTG 660 

CTTCGTATTC AATCGACCAT TGGTGATCCG CATCATAAAT ATGTGTAATC GTCACATCGA 720 

TATCACCCGG GAAATGATCA TCAGCTGATT TCAACACCGT CTTAAATATA ACTTTAATTT 780 

GAGCAATTTC ATTTCTAATT TCATAATCAA ATAACTTATT GTCCAAACCA TGACATCCAC 840 

CATGTAAATG ATGTTCACCG TTGTTTTTTT CTAACTGATA TTCTTTACCT TTCAACTTAA 900 

ATTTAGCATT ATCAATTCTA CCGCTATATC TTCCTATAGA AGCACCAAAT TTAAAAGGAT 960

TACTATGATa AAATTCATCC GCTTCAACAA CATTTCCAAG AACAATATTA TTATCATGAT 1020 

ATTTCCAAGA CACTACTCTT GCTCCATAAT TCGTAAAAAT AATTTTAGTT TCATCATTAT 1080 

CAATTTTGAT TAAATCTACA CCTTGTCTTT GGTGCTCAAC TTCAACTATC ATTTTTACTT 1140 

CTCCCTTCTA ACCACAAGTG TTCAAGCTCT GCTGGGTAGC AACATTACTA AAACACCTAC 1200 

AATACAAATG ATTGCACCGA TAACATCATA TTTATCTGGC ATTTGTTTAT CTACGACCAT 1260 

CGCAAAAATC AAACTCATGA TGATAAATAC GCCACCATAT GCTGCATATA CTCTTCCGAA 1320

TGATGGAAAT GATTGAAATG TCGCAATGAC ACCATATAAC ATGAGTATCG CACCGCCTAT 1380 

TAGCCCAACA AGTGAAGACT GTCCTTCCCT AAGCCACAGC CAAATCAGGT ATCCCCCACC 1440 

TATTTCACAT AAGCCAGCTA ATATAAATAT AAAAATCGGA TATAACATGA AATCACTCCA 1500 

TCACACATTT GCTATCAATA ATCTATCGGC TACATATCAT TTGTTTACAT TTCTTCTTAC 1560 

TTCACATTCC CATTTTAAAA AGTTCGTTTT CACATTCATA TTGTACACTT TTTTAGACAT 1620

TATTCTATAG CTAAATATAA AAAAATAAGA GTAACACGCT TTCATCATCA TTTTATATGA 1680 

TAAATGTGTG TCACTCTCAT CAATTTTATT TTTTAAATAC ACGTTTCATT GAATTAAATA 1740 

AGCCACGTTC AAATGTAAGT ACTGAATCTT TATATGTTTT AATTGCAATC CATATCAAGA 1800 

CAGCTACCAT TACAATTGAG ATTAAAGAAC TTAAGATGAC CTCATATATT TGAAGCCCTG 1860 

AAGTTTGAGC GCGTACAACT AATTGAAATG GCGCTAAAAA CGGAATATAA CTTGTGATTA 1920 

AAGCAAGTTG TCCATCAGGA TTATTTATCG TGAATATCGC GATATAAAAT GCAATCATAC 1980 

CAAGTAATGT CAGTGGCATC AAAGATTGAT TTAAATCTTC TATTCTAGAT GTTAATGATC 2040 

CGAGGATGGC TGCAAGTAAT ACATACGCCG TAATTCCAAC AATACTACTT ATAATTCCGA 2100

CAATAATAAT TTGCCAAGAC AATTGATTCA TTTCCACGTT AAAACCTTGT AGCAAGTCTT 2160 

TTAAGTCAAA GGCAAAAATG CATATAACTG CCATCAATAC AATTAAAATA ATCTGAGTCA 2220 
CTTnCCGGTG TTT 4093

(2) INFORMATION FOR SEQ ID NO: 109: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17846 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109: 

TGCCAAACTA CCTTTTGACA GTCGTTGCTG TACTTCAGGA TGATCAATCA CATATnTTAC 60

TTTATCAAAT AGGGCATCTT CATCATTTTT AGTAATTAAA TAACCATTGA AATCTGAAGT 120 

AATCAGTTCG TTAGGTCCAT ATTTAATATC ATAACTAATA ACTGGAACAC CATGTGCTAA 180 

AGATTCAAGT AGCGCTAAAG AGAAACCTTC CATGTTACTT GTTATTAAAC TCAAATAGGC 240

ATCGCTATAT TCTTGGTCTA GATTGCTTAA AAAGCCGCGT AAGTAAACAT GATTTTCCAA 300 

TCCATATTTT TGTATCAATT CATTTAATTT TTTACTTTCA GAaCCAAAAC CATACATATG 360 

AaGCTCTATT TTTGGGACAT ACGATACTAA GCGTTTAATT AATTCAATTT GTTGATGTAA 420 

TTGTTTTTCA GGTGAATAAC GAGCAACGGA AATTAATTTA ACACTGCGCT GATCTAATGT 480 

TTGGACTGGT GTATCAATTG TTTCACTATA GCCGACAGGA ATATTAACAA CTGGAATAGT 540 

ATGGTTAATA CGTTTTTCAA CATCTAATTT TTGCTGCTCA GTAGAAACGA TAATTGCACG 600 

ATATCGAGAT AAATTTTCAA ACATCGCTTT ATATACATTT TTAAATGGCG ATGAATCTAA 660 

TGCATCAATA TTTTTAATGT GTGTACTGTG AAGCACAGCT ACTACTGGGA TTGACTCAGG 720

CGTTAAGTTG AAAATAGGTG CTGTGTACAC ATTACGATCA CTGAAAAATA AATCCCCATG 780

TTGATATAGT TGTTTAATGA AAAATGCGCC TAATTCCGTT TCATTATTAA AGAAATATTG 840

TTTGTTAGCA TAGTAAACAA TAATTTTTTG TACTTCTGGT TTGCCATCCT TGTAAGAAAA 900 

ATACTTTTCT AATTTTGTGT CACCTTCTGG ATTATAGAAA AATTCACATA ATGTTTGTTG 960 

TTTATCAACA AGAATCCTAC TACAACTTAA AAAGCCACGC ACATCATAAA AATCACGTTT 1020 

TACTTtTCGT CTTTGACTAT CAAAATGATT TACATAATCT AATATACGAT ATTTAGGATC 1080 

TTGAAAATGG GCATACATTA AGAAACGCTC TTGATCATAT ATTCTAAAGT CATGACTATT 1140 

TTCAACATGT TTTAAAGTAT AATGACATTC ATCAGTCCAA TACGACAACC AGTCAAATGG 1200

TTCATTGCGT TCTAAATATG TTGCTTCTTG GAAGAAATCA TACATATTAA TATAGTCAGA 1260 

ACTAGTAATA TAATTTTGGG CATTTCTATA TAAATATCTA TTCCATGACA GAAATACACA 1320 
CCCAGTTAAA TTAACACCTA AACTATTACC TACAAAATAA TTCATTTACA ACACCACTTA 1440

TATCTATTTT TTATAATTAT ATCACATAAT ATTTAATTAC TTCTTTTAAC TGGAAGATGT 1500

GTTTATTTAT AAAACAACAA ATTTTGATAT TTATAATGAT AGTAGTTATT CAATCACTAC 1560

GACCcAATAT ATCATkGTAG AGCTTAGGAT ATTGATTTAT GACTCAGGCA CATCAAATGa 1620

GAgGATTTAT AAArGAGATA TACAACTCTA GAAGGTATAA TAAAAACGCG CAACTAATGT 1680

TACGCGTTTG AATTAATCAT ATGATATTAT TTGCGATACT TTAATTTAGC GAAAgcATCA 1740

TGTTGATGGA TAGACTCTTC ATTACGACAT TCGATATCGA AACCGTCTAA CCAATCAAAT 1800

TCAACTAAGT CCGCGGCAAT TAAACGAATT AAGTCTTCGA CAAAACGTGG ATTTTCATAT 1860

GCACGCTCTG TCACACGTTT TTCATCAGGA CGTTTTAAAA TAGGGTATAG AATTGAACTT 1920

GCATTAGCTT CCATTGCATC TAAAATTTTA TTTTTATAGT CATCAACTAT GTCTTGATCT 1980

TTATTAATAT ATGTTTTAAC AGTGACAACA CCACGTTGGT TGTGCGCTGA ATACTCACTT 2040

ATTTCTTTTG AACAAGGGCA TAGCGTTGTG ACAGTTGCTT CAATAGTAAG TTCTTTACGT 2100

GTAnCTTTAT CACCGTCAAT TGCTAATCCA TAAGTGACAT CGGCATTACC AACTGCTTTA 2160

ATATTTGTGG TTGGACTATA GCGATCAAAG AACCATTTCC CAGAAACATC AACGCCTGCC 2220

GCATTTTGTT TCATATTCGT TTGTAAAGTG CGTAACACCT GATAAAGTGT ATTAAATTCA 2280

AGTTCAATAC CATTATCATA GTGCTTTTCA ACACTTTCGA TTATACGGCT CATATTAATA 2340

CCTTTTTCGT CTTTTGTTAA ACTTGTTGAA AAACTAAATG TGCCAGCTGT TTGATACTGG 2400

TCAACAAGTA CAGGGTACAC TAAGTTTTTA ATACCAACTT CTTCTATTTC AAATAAAAAA 2460

TCTTTATGTG TACTTTGTAA ATCTGTCATT TCGTTCTTAG TAGTAGGTTT CGTGCCTTCA 2520

ATAGGATCTA CGGAACCAAA GTGTTTCCAA CGACCTTCTC GTGTCGATAA ATCAAATTCA 2580

GTCATTTTTT TCCTCCGTTA AGATTTAAAG TGATATGTCC AATATGGTTC GACTGTTAAA 2640

AAGCTGTGTT GTTTACCATC GATTTCAGGA CTTGCTAATT GTTTTAAAAA TGGACCTGTT 2700

TGAGAAGCAT GTGCTTCAAA TGCCTTAATT TTAAGTTCTT TAAAATCTGT AATATCATTT 2760

TGAATATCAG GTTCTCCAAG AGCTTCGGTT GCATCATTAC TGAACGCAAC TAAAGTTAAA 2820

CGAGGGCGTT CTTCTTTAGG CATGCGTTCA ACCGTTCGAA TTACAGCGTC TGCTGTTGCT 2880

TCGTGATCAG GATGTACTGC ATATCCAGGA TAAAATGAAA TAATCAATGA TGGATTTGTA 2940

TCATCGATTA AAGATTTAAT CATACCATCT ATATGTTCAT AGGGTTCAAA TTCGACAGTT 3000

TTGTCACGTA AACCCATTTT TCTTAAATCA GTAATACCGA TAACTTTACA AGCTTCTTCT 3060

AGTTCACGCT CACGAATACT TGGTAATGAT TCGCGTGTTG CAAATGGGGG ATTACCTAAA 3120
TAATTTGCTA ATGTGCCTGC AGATGAGAAG GTTTCATCAT CAGGATGTGG AAATATTACT 3240

AATACATGTC TTTCGTCAGT CATGTTGATG CCTCCTCTAT AAATTAAATG GTCGCTCACT 3300

AATTTGAAGT GCTGCAGCGA GTTGACCTTC GTAATTAAAA CCTGCAATTA AAAATTCATC 3360

ATGCTCATTG ACCTCAAAAT GCGTTAGACC TTGTACATAA ACCCAACCAC CATTTGATAG 3420

TTTAAGACCA ATGCGATAAG GTTCTTTATT ACCACCTTTT AGTTGTGCAT GCGTATATGT 3480

TATTTGTATG TTTCTTAAAA AAGTACCAGC ATTAAAAACA CGTTGATCGA AATGGTTCGC 3540

ATAGGCCCCA TTTGTCGTTT CAACATGCAG ATACACAGGT TTATGTTCAA AAGAAGCAAG 3600

TAAATCTATA ACTTCTTGTT CTTTAATTGG TTCCAACACG TTCACTCCTT ACACTATCAA 3660

TGTGTTTATC TTTCTATTTT ACTAAAAACT ATTCGATAAT TGTATACGAT TGCTCAATTA 3720

TTTATAAATT AATTTTCATG AAGGGTAATT ACTCAGGATT ACGTAATCAT ACAGCATTAG 3780

TTTTTTACTT TTAAAAATCA AAAATTTGTT GGAATTTGAA AAGTGTTAAA CATTAAAAAT 3840

GATGCTATAT TAATGGTGTA TGAATGAATT CATAAGTTTT TAAAATGTAT TAAATTTGTG 3900

GAGGCATGTA AACAATGAAA GTATTAAACT TAGGATCGAA AAAACAAGCA TCATTCTATG 3960

TTGCATGTGA GTTATATAAA GAGATGGCAT TTAATCAGCA CTGTAAACTA GGTTTAGCAA 4020

CTGGTGGTAC AATGACAGAT TTGTATGAAC AACTTGTTAA GTTGTTAAAT AAAAATCAGT 4080

TAAACGTAGA CAATGTATCC ACGTTTAATT TAGACGAATA TGTAGGTTTA ACCGCATCAC 4140

ATCCGCAAAG TTATCACTAT TATATGGATG ACATGCTTTT CAAACAATAT CCTTATTTTA 4200

ATAGAAAGAA CATTCATATT CCAAATGGAG ATGCCGATGA TATGAATGCG GAAGCGTgCA 4260

AAATATAATG ACGTTTTAGA ACAACAAGGT CAACGTGATA TTCAAATTTT AGGTATTGGT 4320

GAAAATGGTC ATATTGGATT TAATGAACCT GGTACGCCGT TTGATAGCGT TACTCATATC 4380

GTTGATTTGA CTGAAaGTAC TATTAAGGCT AATAGTCGAT ATTTTAAAAA CGAaGATGAT 4440

GTTCCAAAGC AAGCCATTTC GATGGGACTT GCTAATATTC TTCAAGCCAA ACGTATCATT 4500

TTACTCGCAT TTGGTGAAAA GAAACGTGCT GCTATTACAC ATTTATTAAA TCAGGAAATT 4560

TCTGTTGATG TTCCAGCCAC ATTACTTCAC AAACACCCGA ATGTTGAGAT ATATTTAGAC 4620

GACGAAGCTT GCCCGAAAAA TGTTGCGAAA ATTCATGTCG ATGAAATGGA TTGATTGCAA 4680

TG7TTAATTA AGAAATGCCT CGGGAAAGGT TCCAATAGAA AGATAAAAAG CATTGGAAGG 4740

ATGATTTTTA GTGGAATTAC AATTAGCAAT TGATTTATTA AACAAAGAAG ACGCGGCTGA 4800

GTTAGCAAAT AAAGTAAAAG ATTATGTAGA TATCGTAGAA ATCGGTACGC CAATCATTTA 4860

CAACGaAGGT TTACCAGCAG TTAAACATAT GGCAGACAAC ATTAGTAATG TAAAAGTATT 4920
CGCGGATGTA ATTACAATAC TAGGTGTTGC AGAAGATGCA TCAATTAAAG CAGCTATTGA 5040

AGAAGCTCAT AAAAATAATA AACAATTACT AGTTGATATG ATTGCTGTTC AAGATTTAGA 5100

AAAACGTGCA AAAGAACTAG ATGAAATGGG TGCTGATTAT ATTGCAGTAC ACACTGGTTA 5160

TGATTTACAA GCAGAAGGGC AATCACCATT AGAAAGTTTA AGAACCGTTA AATCTGTTAT 5220

TAAAAATTCT AAAGTTGCAG TAGCAGGTGG AATTAAACCA GATACAATTA AAGATATTGT 5280

CGCTGAAAGT CCTGATCTTG TTATTGTTGG TGGCGGAATC GCAAATGCAG ATGATCCAGT 5340

AGAAGCTGCG AAACAATGTC GCGCTGCAAT CGAAGGTAAG TAATATGGCT AAATTTAGTG 5400

ACTATCAATT AATTCTAGAT GAATTAAAGA TGACTTTGTC ACATGTTGAA GCGGATGAGT 5460

TTTCAACTTT TGCATCCAAA ATACTACATG CTGAACATAT ATTTGTAGCT GGCAAAGGAC 5520

GTTCAGGATT CGTGGCGAAT AGTTTTGCAA TGCGCTTAAA TCAGCTCGGC AAACAGGCAC 5580

ATGTTGTTGG AGAATCAACG ACACCTGCGA TTAAGTCGAA TGATGTATTT GTAATTATCT 5640

CTGGTTCAGG TTCCACGGAA CATTTAAGAT TATTAGCAGA CAAAGCAAAA TCAGTAGGTG 5700

CTGACATCGT ATTAATTACT ACAAATAAAG ATTCTGCAAT AGGCAATCTA GCTGGGACGA 5760

ACATCGTTTT GCCTGCAGGT ACAAAATATG ATGAACAAGG CTCGGCACAA CCATTAGGAA 5820

GTTTGTTTGA ACAAGCATCT CAATTATTTT TAGATAGTGT TGTAATGGGA TTGATGACTG 5880

AAATGAATGT TACGGAACAA ACGATGCAAC AAAATCATGC TAATTTAGAA TAAAATAAAG 5940

ATAGTCGATA ATATGATGCC TAGGCAGAAA TATTATCGAT TATTTTTTTA TTTAAATAAT 6000

AAATTATAGT ATAATATCAA TAATAAACGA ATAGGGGTGT TAATATTGAA GTTTGACAAT 6060

TATATTTTTG ATTTTGATGG TACGTTGGCA GACACGAAAA AATGTGGTGA AGTAGCAACA 6120

CAAAGTGCAT TTAAAGCATG TGGCTTAACG GAACCATCAT CTAAAGAAAT AACGCATTAT 6180

ATGGSAATAC CTATTGAAGA ATCATTTTTA AAATTAGCAG ACCGACCATT AGATGAAGCA 6240

GCATTAGCAA AGTTAATCGA TACATTTAGA CATACATATC AATCTATTGA AAAGGACTAT 6300

ATTTATGAAT TTGCGGGTAT AACTGAAGCC ATTACAAGTT TGTATAACCA AGGGAAAAAA 6360

CTTTTCGTGG TGTCTAGTAA GAAGAGTGAT GTATTAGAAA GAAATTTATC GGCTATTGGA 6420

TTAAATCACT TGATTACCGA AGCTGTTGGA TCCGATCAAG TAAGTGCATA TAAACCAAAT 6480

CCTGAAGGCA TACACACAAT TGTGCAACGC TACAATTTAA ATAGCCAACA AACGGTGTAT 6540

ATTGGTGATT CAACGTTTGA TGTTGAGATG GCACAACGTG CTGGTATGCA ATCTGCAGCT 6600

GTCACTTGGG GTGCACATGA TGCAAGGTCA TTACTTCATT CAAATCCGGA I"ri u rATTATT 6660

AA7GATCCAT CAGAAATTAA TACCGTATTA TAAAACTTGT TAAAACAGAG AATACCATGG 6720
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ATTTAAAATA AATATTTATT AAACATTATG 
ATAATTTATT TTTGTAAAAA TAAATTAAAG 
5 ATAAACATTT AAATACGATG TCGAAAATGG 

GctATCGCTA TTTTTAGTTA TAATTCCAAA 
TATTGTTTAA TTCAAATGTA TGAGGGTATA 

10 

ATTTTTGAAC AAACATACTT TTGTATTTAT 
AAACTAATTA ACTCCGTATA ATTATGAAAC 

15 TTTAATAAAG AGAATATTAA CATGGTGGAT 

ACCGGTATCG GTAATGCAAT GGAATGGTTC 
TACATTGGAG CGAACTTCTT CTCTCCAGTA 

20 TTCGCAGCAT TAGCCATTGC GTTTTTATTA 

ATTGGTGACA AATATGGACG TAAAGTTGTA 
TCAACATTAA CCATTGGATT ATTGCCAAGC 

25 

CTATTATTGC TTGCAAGAGT ACTACAAGGG 
ATGACATATG TTGCCGAATC ATCTCCAGAT 
GAAATTGGGA CATTATCAGG TTACATAGCT 

30 

TTTTTAACAG ATGAACAAAT GGCATCATTT 
TTCCTAGGAT TATTCGGCTT ATATTTACGT 

35 AATGATGTTG CAACACAACC AGAAAGAGAT 
TATTACAAAG ATATATTTGT ATGTTTTGTA 
ATGGTAACTG CATATTTACC AACCTATTTA 

4° ACAAGTGTAT TAATTACTTG TGTCATGGCA 
AAGTTAGCGG ATAAAATAGG TGAAAAGAAA 
TTATTCAGTA TCATCGCATT TATGTTATTA 

45 

GGTATATTTA TATTAGGATT TTTCTTATCA 
CCAACGATGT TTTACAGTCA TATAAGATAT 
GTTTCGATAT TTGGTGGTaC GaCGCCATTA 

SO 

GATCCATTAG CmCCTGCGTA TTATTTAACA 
ACATTCTTAC ATTTAAGTAC AGCAGGAAAA 
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AATTTXTAAA 


GAGTAATGTC 


TGACTCGTTG 


6840 


TAATGACAAA 


GTTATTGAAG 


TAAATTGAGT 


6900 


CGATAGCATA 


TCACTTACAT 


GAAGTTGTGT 


6960 


AAGTTAATCG 


TTCGATGATT 


TAAGAATTAT 


7020 


AAATCATTGA ATTTAATTCG ATAAAGCGAA 


7080 


ATAAAAGTTT 


AAATTCTTAT 


AAATTTGACA 


7140 


ATACAAGAGG 


GAGTGTATGA ATTCATGGAT 


7200 


GCAAAGAAAG 


CTAAAAAAAC 


CGTTGTTGCA 


7260 


GATTTTGGTG 


TCTATGCATA 


TAcAACTGCG 


7320 


GAGAATGCAG ACATTCGACA AATGTTGACT 


7380 


AGACCAATTG 


GTGGTGTCGT 


ATTTGGTATT 


7440 


TTAACATCTA 


CAATTATTTT 


AATGGCATTT 


7500 


TATGATCAAA 


TTGGACTTTG 


GGCACCAATA 


7560 


TTTTCAACAG 


GTGGAGAGTA 


TGCGGGGGCA 


7620 


AAGCGTCGTA 


ACTCATTAGG 


TAGTGGACTA 


7680 


GCTTCAATTA 


TGATTGCTGT 


ATTAACATTC 


7740 


GGTTGGAGAA 


TCCCATTCTT 


ACTCGGTTTA 


7800 


CG7AAGCTGG 


AAGAATCACC 


AGTTTTCGAA 


7860 


AACATTAACT 


TTTTACAAAT 


CATCAGATTT 


7920 


GCTGTTGTAT 


TCTTCaATGT 


TACAAACTAT 


7980 


GAACAAGTTA 


TTAAATTAGA 


TGCAACGACA 


8040 


ATAATGATTC 


CATTAGCATT 


AATGTTTGGT 


8100 


GTATTTCTAA 


TTGGTACTGG 


TGGGCTAACA 


8160 


CATTCACAAT 


CATTTGTTGT 


AATAGTAATC 


8220 


ACTTACGAAG 


CGACAATGCC 


AGGGTCGTTA 


O O O f\ 


CGAACTTTAT 


CAGTAACATT 


TAATATCTCT 


8340 


GTkGCAmCaT 


GGTTaGTTAC 


GAAAACTGGA 


8400 


GCAATCAGTG 


TTATTGGCTT 


TTTAGTTATT 


8460 


TCTCTAAAAG 


GTTCGTATCC 


AAATGTAGAT 


8520 
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GAACGTAAGA ATTAGAGATT TTAATaAAAA GTATAAATCA ATCGTATATA AGCACTTTAA 


854 0 




AGCTAGTAGG 


TTCTGCTAAC 


TTTAAAGTGC TTTTTAAATT 


GAGAACTGTA 


ATTAGCCGTA 


8700 


5 


ATAAAGTTTT 


TGTATATACA 


TAAACCCCCA 


CTGCAATGAT 


TATCGCAATG 


GGGGAAAGAG 


8760 




GGGACTTAAA 


GCATATGTTT 


AGCTTTGAAT 


ACTTAAAATT 


CTCTTGCTAT 


TGAAATGTTA 


8820 




GGATGTAAAT ATGTCTTAGA GTATTTTGTC 


CAACGCAATT 


AATATTGAGA 


CTCTAACCTT 


8880 


10 


CAATATTATT 


ATAGAGAACA 


CAAACTTAAA 


TAGATTGGGT 


GACTTATTTG 


TGTCAGTTAT 


8940 




TGCGATTGCG 


ATAACTTCTT 


TTCTCTATAT 


ACATATAGTA ACGTCTTATC 


TAATAAAAAA 


9000 




CATGGTACTA 


CAGTATCAAA 


TTTATCTAGG 


GCTTAAGTTT 


GATTTTTATA ATAGGCAGGT 


9060 


TTACCTGATA 


AAAATACTTA 


TTCATTATAT 


AATGTTAACA 


ATATGTATTT 


TAAAGTTTAC 


9120 




ATTGAGTGAG 


GGATATTGAT 


GAACGTAATT 


TTAGAACAGT 


TGAAAACACA 


TACTCAAAAT 


9180 


20 


AAACCTAATG 


ACATAGCATT 


ACATATCGAT 


GATGAAACAA 


TTACATATAG 


TCAACTAAAT 


9240 




GCCCGCATCA CTAGCGCAgT TGAATCTTTG CAGAAATATT CACTTAACCC TGTCGTTGCT 


9300 




ATTAATATGA 


AATCACCGGT 


GCAAAGTATT 


ATTTGTTATT 


TAGCTTTGCA 


TCGTTTACAT 


9360 


25 


AAAGTGCCTA 


TGATGATGGA 


AGGTAAATGG 


CAAAGTACTA 


TACATCGTCA 


ATTGATTGAA 


9420 




AAATATGGTA 


TTAAAGATGT 


AATTGGAGAT 


ACAGGTCTCA 


TGCAGAATAT 


AGACTCACCG 


9480 




ATGTTTATTG 


ATTCAACGCA 


ATTACAGCAC 


TACCCCAATT 


TATTACATAT 


TGGTTTTACT 


9540 


30 


TCAGGGACAA 


CTGGACTGCC 


AAAAG CAT AT 


TATCGTGATG 


AAGATTCATG 


GTTGGCTTCT 


9600 




TTTGAAGTTA 


ATGAAATGTT 


GATGTTAAAA 


AATGAAAATG 


CAATAGCAGC 


CCCTGGACCA 


9660 


35 


CTATCGCACT 


CGTTAACATT 


ATATGCGTTA 


TTGTTTGCTT 


TAAGTTCCGG 


TCGTACTTTT 


9720 


ATAGGACAGA 


CCACTTTTCA 


TCCTGAAAAG 


TTACTTAATC 


AATGTCATAA 


AATATCATCA 


9780 




TACAAAGTTG 


CTATGTTTCT 


TGTTCCAACG 


ATGATTAAAT 


CATTATTGTT 


AGTTTACAAC 


9840 


40 


AATGAACATA 


CAATCCAATC 


ATTTTTTAGC 


AGTGGAGATA 


AGCTGCATTC 


TTCTATTTTT ' 


- 9900 




AAAAAGATAA 


AAAATCAAGC 


AAATGACATA 


AATTTGATTG 


AA'lViTT'lXiG 


TACATCGGAA 


9960 




ACCAGTTTTA 


TCAGCTATAA 


CTTGAATCAG 


CAAGCACCAG 


TTGAATCAGT 


AGGTGTGCTA 


10020 


45 


TTTCCAAATG 


TGGAATTGAA 


AACAACGAAT 


CACGATCACA 


ATGGTATAGG 


AACTATTTGT 


10080 




ATAAAAAGTA 


ATATGATGTT 


TAGTGGCTAT 


GTAAGTGAAC 


AATGTATAAA 


TAATGATGAA 


10140 




TGGTTTGTTA 


CTAATGATAA 


TGGCTATGTA 


AAAGAGCAGT 


ATTTATATTT 


AACGGGACGT 


10200 


50 


CAACAGGATA 


TGTTAATTAT 


TGGTGGTCAA AATATATATC 


CAGCACATGT 


TGAACGCCTT 


10260 




TTAACGCAAT 


CTTCGAGCAT 


TGATGAAGCA 


ATTATCATCG 


GTATTCCAAA 


TGAGCGTTTT 


10320 



55 






EP0 786 519 A2 






CAATTTTTAA AAAAGAAAGT GAAaCgnTaT GAAATTCCAT CGATGATTCA TCATGTAGAA 10440 

AAGATGTATT ACACTGCAAG tGGTaAAATT GCTAGAGAAA AAATGATGTC GATGTATTTG 10500 

AGAGGTGAAT TATAATATGA ATCAAGCAGT CATAGTTGCA GCTAAACGAA CTGCATTTGG 10560 

GAAATATGGT GGCACTTTAA AACATTTAGA GCCaGAACAA TTGCTTAAAC CTTTATTCCA 10620 

ACATTTTAAA GAGAAGTATC CAGAGGTAAT ATCTAAAATA GATGATGTAG TTTTAGGTAA 10680 

TGTTGTTGGG AATGGTGGCA ATATTGCAAG AAAAGCATTG CTTGAAGCGG GGCTTAAAGA 10740 

TTCAATACCT GGCGTCACAA TCGATCGGCA ATGTGGGTCT GGACTTGAAA GTGTTCAATA 10800 

TGCATGTCGC ATGATCCAAG CCGGAGCTGG CAAGGTATAT ATTGCAGGTG GTGTTGAAAG 10860 

TACAAGTCGA GCACCTTGGA AAATCAAACG ACCGCATTCT GTGTACGAAA CAGCATTACC 10920 

TGAGTTTTAT GAGCGTGCAT CATTTGCACC TGAAATGAGC GACCCATCAA TGATTCAAGG 10980 

TGCTGAAAAT GTGGCCAAGA TGTATGATGT TTCAAGAGAA TTACAAGATG AATTTGCTTA 11040 

TCGAAGTCAT CAATTGACAG CGGAAAATGT AAAGAATGGA AATATTTCTC AGGAAATATT 11100

ACCTATAACC GTTAAAGGAG AAATATTCAA CACTGATGAA AGTCTAAAAT CACATATTCC 11160 

GAAAGATAAC TTTGGCCGAT TTAAGCCCGT GATCAAAGGT GGGACCGTTA CCGCTGCGAA 11220 

TAGTTGTATG AAAAATGATG GTGCAGTTTT ATTGCTTATT ATGGAAAAAG ATATGGCATA 11280 

CGAATTAGGT TTCGAGCATG GTTTATTATT TAAAGATGGT GTTACGGTAG GTGTTGATTC 11340 

TAATTTTCCT GGCATTGGTC CAGTACCAGC CATTTCCAAC TTACTAAAAA GAAATCAATT 11400

AACGATAGAA AATATTGAAG TCATTGAAAT TAACGAAGCG TTCAGTGCAC AGGTAGTTGC 11460

CTGCCAACAA GCTTTAAATA TTTCAAATAC GCAATTAAAT ATATGGGGTG GTGCATTAGC 11520 

ATCAGGTCAT CCATACGGTG CAAGCGGTGC CCAATTAGTG ACTCGATTAT TTTATATGTT 11580 

TGACAAAGAG ACTATGATTG CATCTATGGG GATAGGGGGA GGTCTAGGAA ATGCAGCATT 11640 

ATTTACTCGA TTCTAACCAG CGATTAAATG TGTCATTTTC TAAGGATAGT GTGGCTGCAT 11700 

ATTATCAGTG TTTTAACCAA CCTTATAGAA AAGAAGTACC ACCATTAATG TGTGCGTCAT 11760 

TATGGCCAAA ATTTGATTTA TTTAAAAAAT ATGCAAATAG CGAACTGATT TTAACAAAAT 11820 

CAGCAATTAA TCAAACTCAA AAGATAGAAG TAGACACAAT ATATGTAGGG CATTTAGAAG 11880 

ATATTGAATG CCGACAGACT CGCAATATCA CACGTTATAC AATGGCTTTA ACATTAACTA 11940 

AAAATGATCA ACATGTCATA ACGGTtACAC AAACTTTTAT TAAGGCGATG AAGTAGAGAT 12000 


GGAGTTTAAT GAGATATGGA TAAATGAATA TTTGGCGCTC GTAAATGATG ATAATCCAAT 12060 

ACATAATGAG ATTGTGCCAG GACAATTAGT GAGTCAAATG ATGCTGATGG CTATGTCATT 12120 
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ATTCATTGAA CAACACGAAC ACGAAATTAT 
AAAAATTTCT TTGAGCACAA AAAAATAACC 
5 GGAGATGAAA GGACAGCTAA TATCAGTTAT 

AATATAGGTT ACGTTTCTTT CTTTGCACGG 
CTATATCAAT GTTTAATAAA TTCTGGATTA 

10 

CATATGATCT ATATCGTCTT GTAATAAAGA 
TGAATCGTCA CATTTAATTG AAACATGCTG 
TGCGCCTTCA TGGTGATACT GTCGATAAAT 
TCTATGGTTA TATTATAAAT AACATTTTTA 
TTATCAGACA TAGAACGTAT GATTTACTAA 

20 TATATATTTA TAGAGTCGCC TGGCAGTCAT 

GCATCTATCG CAAAAGAATG ATAATGATAG 
CATCTTGAAA ATAAAGGGTT ATTTAGTCAT 

25 TATTGTTCGA TATGTATGAA ATTTTCAATA 

AATTTAAATT ATATACAGAG CATGATGATT 
GTTCATACCC AATTTAAGTG GTGTGGCTAA 

30 

CGTTAAACCT CTGTTACTTC AACATCGATA 
ACAGGACCAA CAAAATCATT CATTTTCCAA 
TTCGCGCTAA TCACAGCTTC TTTCGGTGAC 

35 

GCAGCAAATG TACAACCAGC ACCATGGTTA 
TGATAAAATG TTTGACCATC ATAGTATAAG 


40 CCACCTTTAA TGATGACATG CTGTGCGCCT 

ATATCTTCAA TTGAATTTAA TTTACCTAAT 
GGTGTCACTA CCGTTGCTTT AGGTAGTAAA 

45 TTAAGCACTT CATCTTCGCC TTTACAAACC 

TTAGATGCCT CATATACTTC TCCAGCACGT 
GTTTTAATAG CATCAGGTCC GATTGATAAA 

SO 

ATTGGTAATG GTGTAACATC GTGTGACCAT 
AAAGCGACCA TGCCATACGT ATCTAATTCT 
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AGCAATTAAT GACGATGGAG AGATTAAAAT 12240 

GATATTAGCT GCATGAACGC ATATTAATTA 12300 

GTATTGTTAT TATTATTGGG AACAGAGATG 12360

GGATGCATTA ATCTAAAATA ATAATAACAA 12420 

TTGGAACGAT TAGTCAATTT AACTAACTTT 12480 

GAGCAATTTG AATATTTCAG TATCACTAAA 12540

AAACGTTTTG GTTATAATTT CATAAACTGG 12600 

AATCATAACC TATATTACCT CCTTTGCTAC 12660 

TGTGTGACAT CAACCTTAAG TATCAACTTT 12720 

GACTATTTAT GTATAAAAGT TCTAAATAAA 12780 

TTGGGaAATA TAACATATAT GATTAGAGAG 12840 

AGGTATTGAG CATATAGATG AGTTTAAGTT 12900

AGATGTAGAT GTATAGGAAA TATTTGTATG 12960 

AAAGCTAATA ACGCTTATAT GTAACTTTCA 13020

ATAAAAAAAT AACCACATCA CATAAATTGA 13080 

TAATGTTGAT TTATAGATGA ACCGCCTAAT 13140 

TGTTCAATAC GGTTGTATGC ACCGTGATCC 13200

CCGTTTTTAA TAGCAGAAGC GACGA^AGCT 13260 

TTACCGTTAG CTAAATATGC AGTTGTTGCC 13320 

TAACTTTGTT GGAACATGTC TGTTGTTAGT 13380

TCATACGATT TATCTTGATC TAAAGCTTTG 13440 

TTATCAAAGA TAATTGTTGC AGCCTTTTTC 13500

CCTGATAATT GACCCGCTTC AAATAAGTTT 13560

TATTTAATCA TCGCCTCAGT ATTTCCAGGA 13620

ATGACAGGAT CTACTACAAA ATATTGTGCA 13680

TTGATTATCT CCTCAGTACC TAACATACCT 13740 

GCCGTTTCAA GTTGTTTTTC AAATACATCC 13800 

GTATCTTTAT CCATAGTAAC GATGGCAGTT 13860 

TGGAACGTTT TCAAATCTGC TTGCATACcT 13920 
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CACTCCTACA TAATAATATT GTATTCATCA 
AGCATTCAAT ATTTGATGAT TGTTGAAATG 
5 TGTCATTCAC TTTAGATAAG TGTGATATGT 

AATGGTCGCA AATTTTTCAT GACATAACAA 
TTTTAGAAAA AGAATATTCG ACTGCAATCG 

10 

CGTTTGATTT AACACCGTTT GAAAATATCA 
ATGGTCCAAA CCAAGCACAT GGATTAGCAT 
CATCTTTACG TAATATGTAT AAAGAATTAG 

75 

CGCATTTACA AGATTGGGCA AGAGAAGGCG 
GACAGGGTGA AGCAAATTCT CATCGTGATA 

20 TTAAAGCAGT GTCTGATTAT AAAGAACATG 

AGCAAAAAAT AAAGCTTATC GATACATCTA 
GTCCACTGTC TGCATATAGA GGATTCTTTG 

25 ATTTAGAGTC AGTAGGAAAA TCACCAATTA 

ATAGAGAAAC TTTAATAGCA CGAATTGAGC 
ATGACCATGA CTTTGAAAAA CATATGTATG 

30 

CAACATCAAA TACACCACAT ATTGGTGAAC 
ATCAAATGCC ACAATCACAA ATAACGCAGC 
AAGCGATGGG TGGTAAAGTA AATACGCATT 

35 

AACCTTCAAA CCAACAACAA AGATTAGCGA 
TAT&GATTT TTAAAAAGCA ACAATGAAAC 

40 GGTTAATAAT CAAGACGCAT ATACTTTTAT 

ACTGAATTAT ATAAGGAGAG GTAGCAATGA 
CGATGATGGC TGTCGGTACA GGTGCATTTG 

45 ATCACTATTT ATCAGTATGG GAAAAAGCAA 

TATTAATTAT AGGTGTAATT AGTGGTACAA 
TAATATTTGC TGGTATTATT TTCTTTAGTG 

SO 

TTAAAGTTTT AGGTGCGATT ACGCCAATTG 
TGTTAATCAT TGCGACATTC AAATTTGCTG 
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TATCATTTTT AACCTAATTG AAAAATATTA 14 04 0 

AATCATTCAT ACTATTGTAA CTTTTGAAAA 14100 

TAAAATATGT CCTGAGGTGA GATTGAATGG 14160 

CGAAACATGA CTTTAAAGCT ATGCATGATT 14220

TATACCCTGA TAGGGAAAAT ATATATCAAG 14280 

AAGTTGTTAT ATTAGGACAA GACCCGTATC 14340 

TTTCAGTGCA ACCTAACGCA AAATTCCCTC 14400 

CAGATGATAT TGGATGCGTT AGACAAACAC 14460 

TCTTGTTATT GAATACAGTT TTAACCGTAA 14520 

TTGGTTGGGA AACATTTACT GATGAAATTA 14580 

TTGTCTTTAT TTTGTGGGGG AAACCTGCAC 14640

AACATTGTAT TATAAAATCA GTGCATCCTA 14700 

GATCAAAACC GTATTCCAAA GCGAATGCCT 14760

ATTGGTGTGA AAGTGAGGCG TAGATGTTGA 14820

AAGAATTAGT ACAAGCAGAG CAGGCACAGC 14880

CCATACATAT ATTAACATCT TTATATGCTT 14940

AACAAATGAA TCGTCGTATT GCTAACCATA 15000 

CAACTCATCA AGTGACAGTT GCTGAAATTG 15060 

CAGCACATCA TCATAATAAG TCATATTCAC 15120 

CAGATGATGA CATTGGCAAT GGTGAATCCA 15180 

ATAATTACTT AATAGCTTGT TAAGTATGTA 15240 

TCGAGTGTTC GGATTTAAAC ATTTATTAAT 15300 

AATTATTTAT TATTTTAGGT GCATTAAACG 15360 

GTGCGCATGG TTTACAAGGA AAAATAAGTG 15420

CGACGTATCA AATGTACCAT GGCTTAGCAT 15480 

CTTCAATCAA TGTTAACTGG GCTGGCTGGT 15540 

GATCATTATA TATTTTAGTA TTAACTCAAA 15600 

GTGGCGTATT GTTCATCATT GGATGGATAA 15660 

GTTAAATTTT AAAACTTTAG ATTACCTATG 15720 
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TGGGTATAGA ATACCTTCGA GGTGAGTTTT 
ATAGAGGCGA TTTAAAACAA AACCTATCTG 
5 CATGTATCGG ATGGGGCGCA TTCATCTTAC 

TTGCAGCATC AATTGGTATA GTTATTGGTG 
ATGGCGCATT AGTAGAGAGA TTTCCAGTAT 

10 

GTTTCGGCAG ATATGTGAGT TTCTTCTCAT 
TCGTTGCTTT AAAtGCGACC GCATTCAGTT 
TAAATAATGG GAAACTATAC ACCATTGCGG 

75 

TTGCGACCGT ATTACTACTT GTATTCATGC 
GATCATTACA ATATTATTTC TGTGTGGCGA 

20 GTTCATTCTT TGGTAATAAT TTTGCACTTG 

AAGGATGGTT AGTGTCTATT GTGGTTATTG 
TTGATAATAT TCCACAAACA GCAGAAGAGT 

25 TTATCGTGTA CAGTTTATTA GCAGCATCAT 

GTTGGTTATC AACAAGTCAT CAAAGTTTAA 
CACAAACAGC ATTTGGTTAT ATTGGATTAG 

30 

TATTTACTGG TTTAAATGGA TTCTTGATGA 
GTTCAGGTAT TATGCCAACA ATGTTTAGTA 
TCGCAATCAT ATTCCTAGTA GGAGTGTCGT 

35 

TGACTTGGAT TGTAGATATG TCATCTACTG 
TGTGTGCAGC GAAATTATTC AGTTATAACA 

40 AAACGTTTGC TATTATCGGC 7CATTTGTAT 

CAGGTTCTCC TGCAGCACTG ACTGCACCGT 
TCGGTTTAAT ATTCTTTGTG ATTCGATATC 

45 TAAGTCGCTT GATTTTAAAT AGAAGTGAAA 

AAAAAGAAAA AACTAAATAA TAAAAGAATC 

ATCGTGCGAT TTTTTGTATT ATAAATTGAC 

so 

TAATTGCTAA GAGTTAGGGC TGAGCCATTT 

TTCACGAACC CAGAAACAAT TAATTTGGAA 
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TATTTATGGA AAAAAAGAAT AAGCAAATAG 15840 

AAAAGTTTGT ATGGGCGATT GCATATGGTT 15900 

CAGGAGACTG GATTAAGCAG TCAGGTCCGA 15960 

CATTATTAA7 GATATTAATT GCGGTTAGTT 16020 

CAGGGGGCGC GTTTGCCTTT AGTTTCTTAA 16080 

CATGGTTTTT AACTTTTGGT TATGTCTGTG 16140 

TACTAGTTAA ATTCTTATTG CCAGATGTCT 16200 

GCTGGGACGT TTATATTACG GAAATCATTA 16260 

TAGTAACGAT TCGTGGCGCA AGTGTATCTG 16320 

TGGTAATCGT CGTATTATTG ATGTTCTTTG 16380 

AAAATTTACA ACCGTTAGCT GAACCTAGCA 16440

TATCCGTGGC ACCATGGGCA TATGTTGGAT 16500 

TTAACTTTGC ACCAAACAAG ACATTTAACC 16560 

TAACTTATGT TGTCATGATT TTATACACTG 16620 

ATGGGCAGTT GTGGTTAACA GGTGCTGtTA 16680 

GTGTATTAGC AATTGCAATT ATGATGGGTA 16740 

GTTCAAGTCG CTTGTTATTT TCTATGGGAC 16800 

AATTACATAG TAAATACAAA ACACCATATG 16860 

TAATTGCACC TTGGCTAGGA AGAACTGCAT 16920

GTGTATCCAT TGCCTACTTT ATTACATGTT 16980 

AACAAAGTAA TACGTATGCA CCGGTTTACA 17040 

CATTCATTTT CTTAGCGTTG TTATTAGTGC 17100 

CTTATATTGC ATTACTTGGA TGGTTAATCA 17160 

CTAAATTGAA AAATATGGAT AATGATGAAT 17220 

ATGAAGTTGA TGATATGAT7 GAAGAACCTG 17280

GCACAATAAA CCTTCTTCAT TCGGAGGCGT 17340 

ATTTAAGACG AGGCAGCTGA ACCTTATATA 17400 

CTAACAAATA TTTATAATCG TTTAAAAGAT 17460 

ATTTGGTCGG CGAATAATAA ACCTAATGCG 17520 
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AAGACTAAAT TTTTTGTAGC ATCGTATGCT AAGCCACCAG GTACTAATGG AATGATACCC 17640 

GTTACCATAA AAATGATGGC AGGTTCTTTT TGTTTACGAG CCATATAATG ACTTAACAAG 17700 

CCTAATGCTA AACTACCAAA GAAACTAGAG TATATAGTGT GCACATTAAA GCCGTTGAAG 17760

AATAAGGTGT AAACCATCCA TCCACACGTA CCAACGAAAC CACATGATAG ATATAATTTT 17820 

CTAGGTGCAT CAAAAATGAC GCAGAA 17846 


(2) INFORMATION FOR SEQ ID NO: 110: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5544 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110:

ATTGACACTT GGTGAAAGTA ATATCGCCGC GCTATTTTGG CAAAATGGAC ACTTAGAACC 60 

TGAGTTACAA GATGAACAGC CAATTAATAT ATTAGGATCT GkTCAAATCA ACGAATGGAA 120 

TGGTAATCAA TCACCGCAAA TAATTATTCA AGATATTGCG ATGAATGAAC AGCAAATATT 180

AGATTATAGA AGTAAGCGAA AAAGTTTACC TTTTACAGAA AATGATGAAA ATATTGTCGT 240 

GCTTATTCAT CCTAAAAGTG ATAAAGTAAA TGCGAATGAA TATTATTATG GTGAAGAAAT 300 

TAAACAACAA ACTGATAAAG TAGTATTAAG AGATTTACCA ACGTCAATGG AAGACTTGTC 360

TAATTCCTTG CAACAACTGC AATTTTCTCA ACTTTATATA GTTTTGCAAC ATAATCATTC 420 

GATTTACTTC GATGGTATAC CTAATATGGA TATTTTTAAA AAGTGTTATA AAGCATTAAT 480

AACTAAACAA GAAACAAATA TCCAGAAAGA GGGTATGTTA TTGTGTCAAC ATTTAAGTGT 540 

GAAACCAGAT ACACTTAAAT TCATGTTGAA AGTTTTCTTA GACTTAAAAT TTGTAACACA 600

AGAAGATGGT TTAATTCGAA TCAATCAACA ACCTGATAAA AGATCGATTG ATTCCAGCAA 660 


AGTATATCAA TTAAGACAAC AACGTATGGA TGTTGAAAAG CAATTATTAT ATCAAGATTT 720 

TTCAGAAATA AAAAATTGGA TAAAGTCACA ATTGTCGTGA GCAATTTAGG AGGAAATATT 780 

AATGGATTTA AAGCAATACG TATCAGAAGT TCAAGATTGG CCGAAACCAG GTGTTAGTTT 840

CAAGGATATT ACTACAATTA TGGATAATGG TGAAGCATAT GGCTATGCAA CAGATAAAAT 900 

TGTAGAATAC GCAAAAGACA GAGATGTTGA TATCGTTGTA GGACCTGAAG CGCGTGGCTT 960 


TATCATTGGC TGTCCTGTAG CTTATTCAAT GGGGATTGGC TTTGCACCTG TTAGAAAAGA 1020 

AGGGAAATTA CCTCGTGmAG TCATTCGTTA TGAGTATGAC CTAGAATATG GTACAAATGT 1080 
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ATTAGCTACT GGTGGTACGA TTGAAGCAGC AATAAAATTA GTTGAAAAAT TAGGCGGTAT 1200 

CGTAGTAGGT ATTGCATTTA TAATTGAATT GAAATATTTA AATGGTATTG AAAAAATTAA 1260 

AGATTACGAT GTTATGAGTT TAATCTCATA CGACGAATAA TAAATAATAT AATTTTATCA 1320 

AATGAAATCC TTCATCAAAT GTATAAGAAC CAATGACTTA ATTAAAAAAG TTGTTTAAGT 1380 

TTTCTTAACA TGAGATGTTA GGATTTTTTA TTTACTGAAA ATGTTAGATG ATTGAGCATT 1440 

ATACCTTAAT AACATCGTTT ATTTATTTCA TAAATTGTAG TATCATAGAA CTAATATTTA 1500 

AAAAATGAAA CAGTAGATTT AGGTCGAATT TTTGTAAAAG TTTTAAAAGT AGGAATAGTA 1560 

TACAAATTAA ACTCGCTCAA GTAAAATTAA TATTACGATT AATGACGACA GGATAAATAT 1620 

TTATCGTCGA CGGACGTATG ATTGGTGTGG GACAAATACT ATTCAACAAG AGTACCTAAA 1680 

TCATTGTTTA AGGCGAAGTA ATAAATATGA ATGGGGTGTA TCATATAATG AACAACGAAT 1740 

ATCCATATAG TGCAGACGAA tTCTTCACAA AGCAAAATCA TATTTGTCAG CAGATGAATA 1800

TGAGTATGTT TTAAAAAGCT ATCATATTGC TTATGAAGCA CATAAAGGTC AGTTCCGAAA 1860

AAACGGATTA CCATACATTA TGCATCCTAT ACAAGTTGCA GGTATTTTAA CAGAAATGCG 1920

ATTAGACGGA CCGACGATTG TCGCAGGTTT TTTGCATGAT GTAATTGAAG ATACACCGTA 1980

TACATTTGAA GATGTAAAAG AAATGTTCAA TGAAGAAGTT GCTCGAATTG TTGATGGTGT 2040 

GACGAAGCTT AAAAAAGTAA AATACCGCTC AAAAGAAGAA CAACAAGCTG AAAATCATCG 2100 

CAAGTTATTT ATTGCGATTG CCAAAGATGT ACGCGTAATT TTGGTGAAAT TAGCAGACAG 2160 

ATTACATAAT ATGCGTACCT TGAAAGCCAT GCCGCGCGAA AAACAAATTA GAATTTCTCG 2220 

AGAAACATTA GAAATTTATG CACCATTAGC ACATCGTCTT GGTATTAATA CAATCAAATG 2280 

GGAACTAGAA GATACGGCTC TTCGTTATAT TGATAATGTG CAATATTTTA GAATAGTCAA 2340 

TTTAATGAAG AAGAAACGTA GTGaACGTGA AGCGTATATC GAAACGGCTA TTGATAGAAT 2400 

ACGTACTGAA ATGGACCGAA TGAATATCGA AGGCGATATA AATGGTAGAC CTAAACATAT 2460

TTACAGTATT TATCGGAAAA TGATGAAGCA GAAAAAACAA TTTGATCAAA TTTTTGATTT 2520 

GTTGGCGATA CGTGTTATTG TCAATTCTAT TAATGATTGT TATGCGATAC TTGGGTTGGT 2580 

GCATACGTTA TGGAAACCGA TGCCAGGACG TTTTAAAGAT TATATTGCAA TGCCTAAACA 2640

AAATTTGTAT CAGTCATTGC ATACTACAGT AGTAGGCCCA AATGGAGACC CGCTCGAAAT 2700 

CCAAATACGA ACGTTTGATA TGCACGAAAT TGCTGAGCAT GGTGTTGCAG CACACTGGGC 2760 


TTACAAAGAA GGTAAAAAAG TAAGTGAAAA AGATCAAACT TATCAAAATA AGTTAAATTG 2820 

GTTAAAAGAA TTAGCTGAAG CGGATCATAC ATCGTCTGAC GCTCAAGAAT TTATGGAAAC 2880 
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TGAGTTGCCA TATGGTGCTG TGCCGATTGA TTTTGCTTAT GCGATTCACA GTGAAGTAGG 3000 

TAATAAGATG ATTGGTGCCA AGGTGAATGG CAAAATTGTA CCAATTGACT ATATTTTACA 3060 

AACAGGCGAT ATTGTTGAAA TACGTACTAG TAAACATTCA TATGGACCAA GTCGTGATTG 3120 

GTTGAAAATT GTTAAATCGT CTAGTGCCAA AGGTAAAATT AAAAGTTTCT TCAAAAAACA 3180

AGATCGTTCA TCTAATATTG AAAAAGGCCG AATGATGGTT GAAGCTGAAA TAAAAGAGCA 3240 

AGGATTTAGA GTCGAAGATA TTTTGACAGA GAAAAATATT CAGGTTGTTA ATGAAAAATA 3300

TAACTTTGCA AATGAAGATG ATTTATTCGC AGCTGTAGGA TTTGGCGGCG TGACATCCTT 3360

ACAGATTGTT AATAAATTAA CTGAAAGACA ACGTATTTTA GATAAACAAC GTGCTTTAAA 3420

TGAAGCACAA GAAGTTACGA AATCATTGCC TATTAAAGAC AACATCATTA CTGATAGTGG 3480

TGTCTATGTA GAAGGTTTAG AAAATGTACT TATCAAGTTG TCAAAATGTT GTAATCCTAT 3540 

ACCaGGTGAT GATATTGTAG GTTATATCAC CAAAGGTCAC GGTATTAAAG TACATCGCAC 3600

TGATTGCCCA AATATTAAGA ACGAAACTGA ACGACTAA7T AATGTTGAAT GGGTAAAATC 3660

AAAAGACGCA ACTCAAAAAT ATCAGGTTGA TTTAGAGGTA AtGCGTATGA CCGAAATGGC 3720

TTGTTGAATG AAGTACTACA AGCTGTTAGC TCGACAGCCG GCAATTTAAT TAAAGTTTCA 3780

GGACGTTCAG ATATTGATAA AAATGCAATA ATAAATATTA GTGTCATGGT GAAAAACGTG 3840 

AATGATGTTT ATCGTGTGGT AGAAAAGATC AAACAACTTG GTGATGTTTA TACAGTAACA 3900

AGAGTTTGGA ACTAGAGGTG CAAAATATGA AAGTAGTTGT ACAAAGAGTT AAAGAAGCAT 3960

CGGTGACGAA TGATACATTA AATAATCAAA TCAAAAAAGG ATATTGTTTA TTAGTCGGTA 4020

TCGGTCAGAA CTCTACAGAG CAAGATGCAG ATGTAATTGC AAAGAAAATT GCTAATGCAA 4080

GATTATTTGA AGATGACAAT AATAAATTAA ACTTTAATAT CCAACAAATG AATGGTGAAA 4140 

TACTATCAGT TTCACAATTT ACTCTCTATG CAGATGTAAA AAAAGGTAAC CGTCCAGGTT 4200 

TCTCAAATTC TAAAAATCCT GATCaAGCGG TAAAAATTTA TGAGTATTTT AATGcaTGCG 4260

CTACGAGCGT ATGGTCTTAC TGTGAAAACA GGTGAATTTG GAACACACAT GAATGTTAGC 4320

ATAAATAATG ATGGTCCAGT CACTATTATT TATGAAAGTC AGGACGGCAA AATTCAATGA 4380

AAAAAATAGA GGCATGGTTA TCTAAAAAGG GTCTTAAAAA TAAACGTACT CTAATAGTAG 4440

TGATTGCCTT TGTCTTATTT ATCATCTTTT TATTTTTATT GCTGAATAGC AATAGTGAAG 4500 

ATAGTGGGAA CATCACGATA ACTGAAAATG CTGAATTACG TACAGGTCCA AACGCTGCGT 4560 

ATCCAGTCAT ATATAAAGTT GAAAAAGGTG ACCATTTTAA AAAGATTGGT AAAGTAGGTA 4620 

AATGGATTGA AGTTGAAGAT ACATCCAGTA ATGAAAAAGG TTGGATAGCT GGATGGCACA 4680
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TAGTGCTTGA TCCTGGTCAT GGAGGTAGTG ACCAGGGTGC TTCAAGCAAT ACTAAATATA 
AAAGTTTAGA AAAAGATTAT ACGTTGAAAA CAGCAAAAGA ATTGCAGCGT ACTTTAGAAA 
AAGAAGGCGC AACTGTTAAG ATGACAAGAA CAGACGATAC ATATGTTTCA CTAGAAAATC 
GTGATATCAA AGGCGATGCC TATTTGAGTA TACATAATGA TGCGTTAGAA TCATCTAATG 
CAAATGGAAT GACaGTTTAT TGGTATCATG ATAATCAAAG AGCTTTAGCA GATACGTTAG 
ACGCTACGAT TCAGAAGAAA GGTCTACTTT CTAATCGCGG TTCAAGACAA GAAAATTATC 
AAGTGTTAAG ACAAACAAAA GTTCCTGCTG TTTTATTAGA ATTAGGTTAT ATTAGTAACC 
CAACTGATGA AACGATGATT AAAGATCAAT TACATAGACA AATTTTAGAA CAAGCAATTG 
TTGATGGCCT TAAAATTTAT TTTTCTGCGT AGGGCTTGCA AAAATATGTG AAAGTAGTTA 
TCATTGATAT TGAATTTTAT AACTAAAACC GTTAGTATTC TTGAAATGGT AAATGAAATA 
GGTAGCAATC TAACTAAGAT TGTGTAGGAA TATAATCCAT AGACTGAAAG ATTATGCTGA 
GTAGTTTATA TACATTGAAC ACAAGAAGAG GTGCTTTATG AAAAGTAAAG CCGTTAAACG 
TACGTTaAAC GTTTTGAGTG GGTTTATTAA ATGCACGCTT ATAAAAAGTA ATGATGATTA 
CAATTAGGCA TGTTTTTTAA ACCA 
(2) INFORMATION FOR SEQ ID NO: 111: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1067 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111:
AAAAGATTGC AAATATAAAT GGCATGTTTA ATATGTTAGA ACAACAAATC ATTCATAGCC 
AAGATATGGC TCATTTTAGA AGTGAATTTT TTTACGTCAA TCATGaGCAT CGAGAAAACT 
ATGAAgCACT CCTAATTTAT TACAAAAATA GTATCGACAA TCCTATTGTA GATGGTGCAT 
GTTATATTTT AGCCCTACCT GAAATTTTCA ATAGTGTTGA TGTTTTCGAA TCAGAGTTAC 
CATTTTCATG GGTATATGAT GAAAATGGCA TTACCGAAAC AATGAAATCA CTTAGCATTC 
CATTACAATA TTTAGTTGCA GCAGCTTTAG AAGTAACTGA TGTGAATATA TTTAAGCCTT 
CAGGATTTAC AATGGGAATG AATAATTGGA ATATTGCTCA AATGCGAATC TTTTGGCAAT 
ATACAGCAAT TATTAGAAAA GAAGCACTAT AACATTAATA ATTAATTAGC TATAAAGATG 
ATTCACAACA ATCATCTTTA TAGCTTTTTT ATGTCTAATT ATTTTTGAGG AAAATtnACAA 
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AATTTTATGT TTTCAAAAGT AAACAATCAA AAGATGTTAG AAGATTGCTT CTATATAAGA 660 

AAGAAAGTGT TTGTAGAAGA ACAAGGCGTC CCTGAGGAAA GTGAAATTGA TGAATATGAA 720 

TCTGAATCTA TTCACCTCAT TGGATATGAT AATGGACAGC CAGTTGCCAC TGCTCGAATA 780 

CGCCCTATTA ATGAAACAAC TGTCAAAATA GAACGAGTAG CTGTGATGAA ATCACATCGT 840

GGACAAGGAA TGGGTAGAAT GCTTATGCAA GCTGTAGAAT CATTAGCTAA AGATGAAGGT 900 

TTTTACGTAG CTACTATGAA TGCCCAATGT CATGCTATCC CATTTTATGA AAGTTTAAAC 960 

TTTAAAATGA GAGGTAATAT ATTTCTTGAG GAAGGCATCG AGCATATTGA AATGACAAAA 1020

AAGTTAACCT CGCTTAATTA AAAAAAGTTG TATCTATTTT AGAAACA 1067 
(2) INFORMATION FOR SEQ ID NO: 112: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18613 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112:

AAGACGtACG ATAACAACAA TACgTGTAGT GAAAGATTTT AATCTACATA TTACTGACAA 60 
AGAATTCATT GTATTTGTTG GACCATCGGG ATGTGGTAAA TCAACAACAT TACGAATGGT 120 






TGCTGGACTA 


GAGTCTATCA 


CATCTGGAGA 


TTTTTATATT 


GATGGGGAAC 


GCATGAACGA 


180 


TGTTGAACCA AAGAATAGAG ATATTGCGAT 


GGTATTTCAA 


AACTATG CAT 


TATATCCACA 


240 


TATGACTGTT 


TTTGAAAATA TGGCATTTGG 


GCTAAAGCTA 


CGTAAAGTAA 


ATAAAAAAGA 


300 


GATTGAACAA AAAGTTAATG 


AAGCAGCTGA 


AATATTAGGA 


TTAACTGAGT 


ATCTTGGTCG 


360 


TAAACCAAAA 



GCGTTATCTG 


GCGGACAGCG 


TCAACGTGTT 


GCTTTGGGCA 


GAGCTATTGT 


420 


TAGGGATGCG 


AAAGTCTTTT 


TAATGGATGA 


ACCATTATCG 


AATCTTGATG 


CGAAyTtCGA 


480 


GTACAAATGC 


GCACAGAAAT 


ATTGAAATTA 


CATAAGCGAC 


TTAATACTAC 


GACAATTTAT 


540 


GTTACACATG 


ATCAAACTGA 


AGCATTGACG 


ATGGCTAGTC 


GAATTGTTGT 


TTTGAAAGAT 


600 


GGCGACATTA 


TGCAAGTCGG 


CACACCTAGA 


GAAATATATG 


ATGCCCCTAA 


TTGCATATTT 


660 


GTGGCGCAAT 


TTATCGGCTC 


ACCAGCAATG 


AATATGTTGA 


ATGCTACAGT 


TGAAATGGAC 


720 



GGATTGAAGG TAGGAACACA CCATTTTAAA TTACATAATA AAAAATTTGA AAAGTTAAAA 780 

GCTGCTGGCT ACTTAGACAA GGAAATTATT TTAGGTATTC GAGCTGAAGA CATTCATGAA 840

GAACCAATAT TTATTCAAAC TTCTCCAGAG ACACAATTTG AATCTGAAGT AGTTGTATCC 900 
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AAATTAGATT CAAGAACTCA AGTGATGGCG 
AATAAGTGTC ACTTTTTTGA TGAAAAAACA 
5 ATGTCTAAAA TTTTAAAATG TATCACGTTA 

TGTGGCCCTA ATCGTTCGAA AGAAGATATT 
GACAAGCCTA ACCAACTTAC GATGTGGGTG 

10 

AAAATTACGG ATCAATATAC TAAAAAAACT 
CAAAATGATC AACTAGAAAA TATTTCGCTA 
TTTTTCTTAG CACATGATAA TACTGGAAGT 

15 

AAATTATCAA AAGATGAGTT GAAAGGTTTC 
GACAATAAGC AACTAGCATT GCCAGCTATC 

20 AAATTAGTGA AAAATGCACC GCAAACGTTA 

ACTGATAGTA AAAAGAAACA ATACGGTATG 
TATCCGTTTT TATTCGGCAA TGATGATTAT 

25 AT7CATCAGC TAGGACTAAA TTCAAAACAT 

TGGTACGACA AAGGGTATCT TCCTAAGGCA 
AAAGAAGGAA AAGTAGGACA ATTTGTCACT 

30 

ACGTTTGGTA AAGATTTAGG AGTAACAACA 
CCATTTCTAG GTGTACGTGG TTGGTATTTA 
AAAGATTTAA TGCTGTATAT CACTAGTAAA 

35 

AGCGAAATTA CTGGACGTGT TGACGTGAAA 
AAGCAAGCAC GTCATGCTGA ACCGATGCCT 


40 CCGATGGGCA ATGCAAGCAT ATTTATTTCA 

GAGGCGACGA ATGATATAAC GCAAAATATT 
AAAGGAGATT AGTTATGACG AAACGTAACC 

45 CTGGTTTGGG ACAGTTTTAT AATAAAAGAC 

TCATCAGTTT TATTTCTGTT TTTTATAGCT 

CATTAGGGAC AGTACCTAAG TTAGACGATT 

SO 

CTATCTTACT CGTTGCTTTC GCAATCATGC 

GTAATGCTGA ACGATTTAAT CGCAATGAGG 

55 



AACGACAAGA TTACACTAGC ATTTGATATG 1020 

GGAAATCGTA TCGTCTAAGG GGGAGTATTC 1080

GCCGTGGTAA TGTTATTAAT CGTAACTGCA 1140 

GATAAAGCAT TGAATAAAGA TAATTCTAAA 1200 

GATGGCGACA AGCAAATGGC GTTTTATAAA 1260 

GGCATCAAAG TAAAGCTTGT AAATATTGGT 1320 

GACGCTCCTG CAGGAAAAGG TCCAGATATC 1380 

GCCTATCTAC AAGGCTTAGC TGCTGAAATC 1440 

AATArGCAAG CACTTAAAGC GATGAATTAT 1500 

GTTGAAACAA CCGCACTTTT TTATAATAAA 1560 

GAAGAAGTTG AAGCTAATGC TGCCAAACTA 1620 

TTATTTGATG CTAAAAATTT CTATTTTAAT 16 80 

ATTTTCAAGA AAAATGGCAG TGAATATGAT 1740 

GTCGTCAAGA ATGCTGAACG ATTACAAAAA 1800 

GCAACACATG ATGTCATGAT TGGTCTTTTT 1860 

GGACCGTGGA ACATTAATGA ATATCAAGAA 192 0 

TTACCTACAG ATGGTGGCAA ACCTATGAAA 1980 

TCTGAATATA GTAAACATAA GTATTGGGCT 204 0 

GATACATTAC AAAAATATAC AGATGAAATG 210 0 

TCATCTAATC CAAATTTAAA AGTGTTTGAA 2160 

AATATTCCTG AAATGCGACA AGTTTGGGAA 2220 

AATGGTAAGA ATCCTAAACA AGCGTTAGAT 2280 

AAGATTCTTC ATCCATCACA AAATGATAAG 2340 

CTAAATTAGC GGCATTATTA TCTGTTATAC 240 0 

CCATTAAAGG GACGATATTT TTTATCTTTT 2460 

TTTTAAATAT TGGTTTTTGG GGATTGTTCA 2520 

CTCGTGTCTT ACTTGCACAA GGTATTATTT 2580 

TATATATCAT TAATATTTTA GATGCATATC 264 0 

AAATAAAGGA TCCGAAGcGC GTATGGTGGC 2700 
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TGTAGTTGTA TTTCCATTAA TAyyTATGTT 
CAACGCGCCT CCGAGACACA CATTAGAATG 
5 CACAATTGGC GTTTGGCGTA AAACATTTTT 

GCTTGTTGCA ACGACACTTC AAATTGCATT 
CCCTGTCGTC AAAGGTAAGA AATTTATCCG 

10 

ATCATTTGTG ACAATTTTAA TATTTGTAGC 
TAATGATATT TTGCAACCTT TATTAGGTGT 
GGCAAAAGTG GCATTAATCG GCATTCAAGT 

15 

GTTCACTGGA GTACTGCAAA GTATTTCATC 
TGCGTCTAGT TGGCAAAAGT TTAGAAACAT 

20 GCCATTGTTA ATTATGCAAT ATGCAGGTAA 

TAATAAAGGC GGTCCACCAG TGTCAGGGCA 
TTGGGTGTAT AATCTGACAT TTGAGTTTAA 

25 AATTATTGGA TTTATTGTTG CTATTGTCGC 

TAAAGATGAG GGAGGTTTAT AAGATGACAA 
TTTACAGTTT TATAGCGATG ATGTTTGTCA 

30 

GCATTTCCCT TAATCCAGGT ACGAACTTGT 
CATTTAAAAA TTATGCATTC TTACTATTCG 
AAAATACGCT TATCGTAGCA TCTGCAAATG 

35 

CAGCATATGC TTTTTCTAGA TATCGCTTTG 
TGATTTTACA AATGTTCCCT GTATTAATGG 

40 CAATTGGATT ATTAGATTCT TTATTTGGAC 

CGATGAATGC CTTTTTAGTG AAAGGTTACT 
CTGCCAAAAT TGATGGTGCA GGGCATATGC 

45 CTAAGCCGAT TTTAGCAGTT GTTGCTTTGT 

TATTACCTAA AATACTATTA AGAAGTCCTG 
ACTTTATTAA TGATAAGTAT GCAAATAATT 

SO 

TTGCAGTACC TATAGCAATC GTATTCTTGT 
CAACAGGTGC GACAAAAGGT TAGTTTGAAA 

55 



TGGAGTAGCA TTTACAAATT ACAATTTATA 2820 

GGTTGGTTTA GATAACTTTA AAACGTTATT 2880 

CAGTGTTATT ACTTGGACAT TAGTATGGAC 2940 

AGGGCTGTTT TTGGCAATTA TTGTAAATCA 3000 

TACTGTGTTA ATCCTACCTT GGGCTGTACC 3060 

GTTATTTAAT GATGAATTTG GTGCGATAAA 3120

AGCACCAGCA TGGTTAAGTG ATCCGTTTTG 3180 

ATGGCTTGGA TTCCCATTTG TCTTTGCACT 324 0 

AGATTGGTAC GAAGCAGCAG ATATGGATGG 3300 

CACATTCCCG CATGTCATTT ACGCCACAGC 3360 

TTTCAATAAT TTTAATCTTA TTTATCTATT 3420 

GAATGCTGGT AGTACAGATA TCTTGATATC 3480

CAACTTCAAC ATGGGTGCAG TTGTGTCATT 3540 

ATTTATTCAA TTCAGACGTA CAAGTACGTT 3600 

AGAAGAAAAA CATATTAAAA GCAATCGGTA 3660 

TCATTTTATA TCCACTACTG TGGACATTTG 3720 

ATGGTGCCAA AATGATACCA GACAATGCAA 3780 

ATGACAGTAG TCAATACCTG ACTTGGTATA 3840

CACTGTTTAG TGTGATATTT GTCACGTTAA 3900

TTGGTCGTAA ATACGGGCTG ATTACATTTT 3 960 

CAATGGTCGC AATCTATATT TTGCTAAATA 4 020 

TAACACTGGT ATATATTGGT GGATCAATAC 4 080 

TCGATACGAT TCCAAAAGAA CTTGATGAAT 414 0 

GTATTTTCTT ACAAATTATG CTTCCATTAG 4200 

TCAATTTTAT GGGGCCATTT ATGGACTTTA 4260 

AAAAATTCAC ATTAGCAGTT GGATTGTTCA 4320 

TCACAGTGTT TGCAGCAGGG GCAATTATGA 4 380 

TCTTGCAACG CTATTTAGTA TCAGGTTTAA 4440 

TTAGGAGTGG GGCAGAATTG ATAAAGAACC 4500 
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15 



GGGTGTGGTG GTATTGCGAA TGGCAAGCAC ATGCCAAGTT TACAAAAAGT TGAAAATGTT 4620

GAAATGATCG CATTTTGTGA CGTAGACATT TCGAAAGCAG CGAGTGCGGC AGAAGCATAC 4680 

GGAACTGACA ATGCAAAGGT TTATGATGAT TACAAAGCAT TGTTAAAAGA TGACACGATT 4740

GATGTTATCC ATGTTTGTAC GCCAAATGAC TCGCATTGTG AAATTACTGT AGCAGGGTTG 4800 

CATGCTGGTA AACATGTGAT GTGTGAAAAA CCAATGGCTA AAACGACAGC AGAAGCTCAA 4860 

AAAATGATAG ATACAGCTAA ATCAACAGGT AAAAAATTAA CAATAGGTTA TCAAAATCGT 4920 

TTCCGAGCAG ATAGTCAATT TTTACATCAA GCAGCGCAAC GTGGCGACTT AGGAGACATT 4980

TACTTCGGAA AGGCACATGC CATTCGTCGT CGAGCAGTAC CAACATGGGG TGTCTTTCTA 5040 

GACGAAGAAG CTCAAGGTGG AGGACCATTA ATCGATATCG GTACACACGC TTTAGATTTA 5100 

ACGTTATGGA TGATGGATAA TTATGAACCA GAATCAGTGA TGGGTTCAAC ATTCCATAAA 5160 

TTAAATAAAC AGCATCATGC GGCAAACGCT TGGGGTTCAT GGAATCCAGA TGAATTTACA 5220

GTTGAAGATT CTGCGTTTGG ATTTATTAAA ATGAAGAATG GAGCGACGAT CATTTTAGAA 5280 

TCCGCTTGGG CGATTAATTC TTTAGAAGTG GATGAGGCAA AATGTTCATT ATCAGGAACT 5340 

AAAGCAGGTG CTGATATGAA AGATGGTCTA CGTATTCATG GTGAAGACAT GGGTACACTT 5400

TATACCAAAC ACGTTGAATT GGAAAACAAA GGCGTCGACT TTTATGAAGG TAATGAAGTG 5460 

GATGAAGCTG AAGAAGAAGC AAAAGCTTGG ATTGATGCAG TTGTAAATGA TACTGAACCA 5520 

GTTGTGAAAC CGGAACAAGC AATGGTAGTT ACAAAAATTC TTGAAGCGAT TTATCAGTCT 5580 

GCAAAATCAG GCAAAGCAAT TTACTTTGAA TAACA7CATA CGGTAAGGAG GCACATCATG 5640 

ACAAAATTAA AAGTTGGTGT GATAGGTGTT GGTGGTATTG CACAAGACCG TCATATTCCA 5700 

GCATTGCTGA AACTCAAAGA CACAGTCTCA TTAGTTGCAG TACAAGATAT TAATACAGTG 5760 

CAGATGATTG ATGTTGCGAA gCGCTTTAAT ATACCTCATG CAGTTGAGAC ACCTAGCGAG 5820 

CTGTTTAAAC TTGTTGATGC GGTGGTCATT TGTACACCTA ATAAATTCCA TGCTGATCTT 5880 

TCTATAGAAG CATTGAACCA TGGTGTCCAT GTATTGTGTG AAAAGCCAAT GGCGATGACG 5940 

ACGGAAGAGT GTGATCGCAT GATTGAAGCG GCTAATAAAA ATCACAAATT ATTAACTGTC 6000 

GCATATCATT ATCGTCACAC AGATGTGGCA ATTACTGCTA AAAAAGCAAT TGAATCAGGT 6060

GTGGTTGGTA AACCTTTAGT AGCACGTGTA CAAGCGATGC GTAGGCGTAA AGTGCCTGGC 6120 

TGGGGTGTTT TTACCAATAA AGCGTTGCAA GGTGGCGGTA GTTTAATCGA TTATGGTTGC 6180 

CACTTGTTAG ACTTATCTTT GTGGCTACTA GGTAAAGATA TGGTGCCGCA TGAAGTGCTA 6240 

GGAAAAACAT ATAATCAATT GAGCAAACAA CCGAATCAAA TTAATGATTG GGGAACATTT 6300 
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GCAAGCATGC 
AGTTTGAATG TTCGTGGTCT GCAAATATCA AAGAAGATAA GGTTCACGTT

AGTTTATCAG GAGAAGATGG CGGTATCAAT TTATTTCCAT TTGAAATATA TGAGCCCCGC

TTTGGAACTA TTTTTGAAAG CAAAGCTAAT GTTGAGCATA ACGAAGACAT TGCTGGTGAG

AGACAGGCGC GTAACTTTGT CAATGCGTGT TTAGGGATAG AAGAGATTGT GGTGAAACCG

GAAGAAGCAC GCAATGTAAA TGCCCTTATA GAAGCGATTT ATCGTAGCGA TCTTGATAAC

AAGAGCATAC AACTTTAATG ATTATCATAT ATGATACAAA ATTCTCAATA TAAAAAGAAG

GAGTGCTTTT CAATGAAAAT AGGTGTATTT TCAGTATTAT TTTACGATAA AAATTTTGAA

GATATGTTAG ATTATGTCTC AGAATCTGGA TTGGATATGA TTGAAGTTGG AACAGGTGGT

AACCCAGGAG ATAAATTTTG TAAGTTAGAT GAGTTGTTAG AAAATGAAGA CAAGCGCCAA

GCATTTATGA AGTCAATCAC AGACAGAGGC TTACAAATAA GTGGTTTCAG TTGTCATAAC

AATCCAATTT CTCCAGATCC GATAGAAGCG AAAGAAGCCG ATGAAACGTT ACGTAAAACA

ATCCGTTTAG CAAATCTATT AGACGTGCCA GTTGTTAATA CATTTTCTGG CATTGCAGGA

TCAGATGATA CCGCTAAAAA GCCTAATTGG CCTGTTACAC CTTGGCCAAC AGCCTACTCT

GAAATTTATG ATTATCAGTG GAATGAAAAG TTGATACCAT ATTGGCAAGA TTTAGCTGAG

TTTGCAAAAG AGCAAGATGT AAAAATTGCC ATAGAGTTGC ATGCAGGATT TTTAGTGCAT

ACACCATATA CAATGTTGAA GTTACGTGAG GCTACAAATG AATATATCGG TGCTAACTTA

GATCCTAGTC ATCTATGGTG GCAAGGTATT GACCCAATTG CTGCGATTCG CATATTAGGC

CAAGCAAATG CAATTCATCA CTTCCATGCT AAAGATACGT ATATTAATCA AGAAAATGTA

AATATGTATG GTCTAACTGA TATGCAACCA TATGGTAACG TTGCGACAAG AGCATGGACA

TTCCGTACAG T7GGTTATGG ACATAGTCCA TATGTATGGG CAGATATCAT AAGTCAACTT

ATTATTAATG GATATGATTA TGTATTAAGT ATTGAACATG AAGATCCTAT TATGTCAGTA

GAAGAAGGTT TCCAAAAAGC TTGTCAAACT TTGAAATCTG TTAATATTTA CGACAAGCCA

GCAGACATGT GGTGGGCATA ATACGAACTC GAGGTTAGTC TGAAGTTTGT CTGAAGTAAG

ACTGGTGGCA GTGTTGAATA AATGCATATG TCGCCAAGCC ATTGCCAAAA ATTTCACACC

TTAAATCAAG TCATTGTTTG TAAAGAAGGT GTACTTTATA TAAGTATATA GCGATGGTCA

TACCCATTCA CAGTAACAAT CCTCACCATT GAAAAGAGTA TATAACCTTT TCAATAGTGA

GGTATATGAT AATAAAAAAA GCCTGTTGTC ACAATGGTCA TAGACACGAC ATACTTTAAA

GGTTTCTGAA TATAATATTT CAGAATGCAC TTTAAAGATG GACGTCGATG TAGACTAAAG

TGATGACAGG CTTTCATCTT TTTAAATATT CATTAATTTC TCTTCTTGTT TAATACGTAC
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TAATACACCG ATTAATTCAG GAATGATGTT TAAGAAGTAA TTTGGGTGTT TTGTAATTTT 
ATATAATCCA GATTTAATAA TAGGATGGTT AGGTAAAATG AATAATTTTA ATGTCCAAAT 8280
ACCACCTAAA GTTTTAATAA CCATAAATAA CATGATATAA GCAAAGATTA ATATAACTAA 8340
GCCAATACCA TTTGCAAAGC TAAATGTATC TTTATTAATA AATGCCTCTA CACCAGCCAA 8400
TACATAAATT AAAACGTGTG TTATTGCTAA AAACTTCGAA TTTTTAACGC CATATTCAAC 8460
TGCACCGTCT GCTTTTAATT GTTTTGAGTG ATTAATAGAT ATCTTTAAGC TGACAAGTCT 8520
GATACAGAAA AAGATAAGTA ATATAGATAG AATCATGATG TCCTCCGTCA TTATGTCATA 8580
TGTATAAGCG TTGATTTTGA CAACATAAAG TATTTTATAG ATAAAGCTTG TCAAATACTA 8640
TTAACTATTT ATTAATTTTA GTACATAAAT ATGTTTCTAA GTATGTGTTT ATGTTCAGTA 8700
TTTTGGATAA TTTAATAATT TTAAGGATAT TAAGCGCTTA CACCGACGTG ATATATTTGG 8760
CTTAACGAAA ATGATTGAGG TGACAGAGAT GAACTTTTTT GATATCCATA AGATTCCGAA 8820
CAAAGGCATT CCATTATCGG TACAACGTAA ATTATGGCTT AGAAACTTCA TGCAAGCTTT 8880
CTTCGTAGTG TTCTTTGTTT ATATGGCTAT GTATTTAATT CGAAACAACT TTAAGGCGGC 8940
ACAACCGTTT TTAAAAGAGG AAATTGGATT ATCTACATTA GAACTTGGTT ATATCGGATT 9000
AGCATTTAGT ATCACGTACG GTTTAGGAAA AACATTACTT GGATATTTTG TCGATGGACG 9060
TAACACAAAA CGTATTATCT CGTTCTTACT TATCTTATCT GCGATTACAG TTTTAATTAT 9120
GGGATTTGTT TTAAGTTACT TTGGTTCTGT AATGGGATTA TTAATTGTAC TTTGGGGACT 9180
TAACGGGGTG TTCCAATCAG TTGGTGGACC TGCAAGTTAT TCAACGATTT CAAGATGGGC 9240
GCCAAGAACG AAACGTGGCC GATACTTAGG ATTCTGGAAT ACATCACATA ATATCGGTGG 9300
TGCCATAGCA GGTGGTGTTG CACTTTGGGG TGCTAATGTA TTCTTCCATG GAAATGTTAT 9360
AGGGATGTTC ATTTTCCCAT CGGTGATTGC ATTACTTATT GGTATCGCAA CATTATTTAT 9420
CGGAAAAGAT GATCCGGAAG AATTAGGATG GAATCGTGCT GAAGAAATTT GGGAAGAGCC 9480
GGTCGATAAA GAAAATATTG ATTCTCAAGG TATGACGAAA TGGGAGATCT TTAAAAAATA 9540
TATCCTGGGA AATCCTGTTA TATGGATTCT ATGTGTTTCA AACGTCTTTG TATACATTGT 9600
ACGAATCGGT ATTGATAACT GGGCACCGTT ATATGTGTCA GAGCATTTAC ACTTTAGTAA 9660
AGGCGATGCA GTTAATACGA TATTCTACTT TGAAATTGGT GCATTAGTTG CAAGTTTATT 9720
ATGGGGCTAC GTATCAGACT TATTAAAAGG TCGTCGTGCA ATTGTAGCTA TTGGCTGTAT 9780
GTTTATGATT ACATTTGTTG TCTTATTCTA CACAAATGCT ACAAGTGTCA TGATGGTTAA 9840
CATTTCATTG TTTGCATTAG GTGCGTTAAT CTTTGGTCCG CAATTATTAA TTGGTGTATC 9900






CGCGTATCTA TTCGGTGACT CAATGGCGAA AGTTGGTTTG GCGGCTATTG CTGATCCAAC 10020 

ACGTAACGGT TTAAACATCT TTGGATATAC ATTAAGTGGA TGGACAGATG TTTTCATCGT 10080 

CTTCTATGTT GCATTATTCC TAGGCATGAT TCTATTAGGA ATCGTTGCTT TCTATGAAGA 10140 

AAAGAAAATT AGAAGTTTAA AAATTTAATA TAAATCGGAT TAAAAGTATC GCCAATCTAT 10200 

TGCAATATAG TTGGCAATCC TGCCCCGACG GCATGTGCGT GAAGAGATGA AAGATACTGC 10260 

TTCTACCCTT GCAAATATAT CATCTCTATG TCTCGGGGCA GATCATAATT CCCTGTTATG 10320 

AAGTATCCTT ATTTGCCCGA CTTAGGGTGA CTCAATGAAT TTACTCCTTA CAATAAAGAC 10380 

ATATAGCGGT GTCAATATTG TAGGGAGTAT TGTTTTATAT TTAAACTCTC TAAAAAGCGG 10440 

ACTGAAAGAA AAGTGAAAAC TTCTCTATCA GTCCGCTTTT TCATAGAACA AAATGGAGGC 10500 

GCCATAATCA TTAGTTATGT GCTAATCTAT TTTGCTTGCT TACAATAATC ACTTGGCGAC 10560 

ATTTGTAAAT ATTTTTTAAA ATGATAGCTA AACATTTTAT ACTCTGAAAA GCCTACTTTG 10620 

TCTGCAATTT CATAGTGTTT GTAATGTCGA TCTAACAATT GCAGAGATTG TAAAATACGA 10680

TAGCGATTTA AATAATCGAC AATTGTAATA CCAACATGAT CTTTAAATGT TCGCATCGCA 10740 

TACGATTCAC TAACATCGAT ATGTTGAATT AAATCTGAAA CAGtCACTTT CGTTTGATAA 10800 

GATTGCTTAA TTTGATCCAC AATCTGGTTT ACATAATAAT CATCGTATTC TACTTTTAAT 10860 

AGTGGTTGGA AGGCATCATG ACAAGATGCT AAGCTACGGC CGTTCTGTGA TTGTTGCTCT 10920 

AATAAGGTAC GGACAAGTCT TCCTAAAATA ACTTCTAATT GTGCATGGTC TACTGGTTTT 10980 

AATAAATAAT CAAGAACATG ATGTTGAATG CCGGCTTTCA TATATTCAAA GTCATCGTAA 11040 

CTCGATAATA TGATGACATT ACAATCTAGA TGCGCAATAT CATTGAGTAA ATCGACGCCA 11100 

TTTTTACGTG GCATACGAAT ATCAGTAATT ACTAATTCTG GCTGATGTTG TTGAATTAGT 11160

GATAATGCTT CAACACCATC TTTAGCAGTG TATATTGTAT TGAAATGATA GTCTCCCCAA 11220 

GGAATGATTT GCTTTAATCC TTCTCGAATA ATTCGTTCAT CATCACAAAT AACTACCTTA 11280 

AACATCTACA TTCCCCCTTG AAAGTGGTAT TTTATAACAA ATTAACGTAC CTTGATTACG 11340 

CTTTGAAAAA ATATGGAGTC GTGCATGTGA ACCATATTGA ATCATTGCTT TATTGTGTAA 11400 

ATGATTTAAT CCCAAATGCT TAGTATCAAA TACATCATTA TTAAGAGATT GGCGTACATA 11460

TTGCAGGCGA GATGACGACA TCCCGATACC ATTGTCGCAA ACTAAAACAT GTAAATTCTG 11520 

ACGTGCCAAT GTCAGGCGTA TAGTAATGTC CAATGACTCA GTATCTCTAC CATGTTTAAT 11580 

AGCATTTTCT ATGAGTGGCT GAAGCATCAT TTTACCAATT GTCTGGTGAC GCGCTTCTTC 11640 

AGAACTTTCA ATATGGAGCT TAATCATGTC ATCAAAACGG aTGTTTTGTA TTGCAACATA 11700 
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GTAACGTAAC ATTTGCGATA ATTGTTGGAC CACAGTTtGT GCTAATTTCG GAGATAACGT 11820 

AATTAAATAT TGTATTGTTT GCATCGTATT GAATAGGAAA TGAGGCTGGA ATTGGCGTTC 11880 

TATTTCCTTT AACTGAATAT CACGCAAGCG ACGTTCTGTA TGCTCGATAG AATGGATCAG 11940 

TTGCTCATTT GATTCAAATA AATCGTAAAT ATAATTATTA ATTTCTTCTA GTTCACTGTT 12000 

GTTTTTTAAA GGCGTATATG TACCTAGATG ACGATTTTTG GCATAGTAAA TTTTTTGAAT 12060 

AATCGTTTCG ATATCTTTTG TTTGTCGTTT AGCCATATTA TCTGCGCTAA TGAAACCAAA 12120 

TATTACTAGT AAAACAAGAA CTACGGCCAT AACAATTAAC AACGTGATAC CATCTTCAAT 12180 

GTTTTCATGT ATATCTTTAT AAATAATGAG ACGATGGTCA GCATGGTTTA ATTTTACAGA 12240 

TTCATTCATA AATCCGAATT GTTGTGGTcT ATACTTTTCA CCTATAGTAA AACGGTCATC 12300 

GTTGGCGTAT AAAATATTGT CATATTGATC AmCGATAAGT GCGAATTGTC GGTTATCTTT 12360 

CtTAATTTCA CTTAAACGTG GGGTGTtAGC CATATAAATt TTaAGCATAT ATGTACTATT 12420

TTTGAATTTA AGCTGATGCG TTGAAAATAA ATACATATTT TTAGTGTTTA AATGTTCATA 12480 

ATTATTGGTT ATAAACTGAT TTGGTCCAGA TAATTCATAA TAAAGTGTTG CGGGCTGTTG 12540 

GkGTATTAAT TTTAATAATT CACGTTTTGT AGCGGTCACA TCATGATGAT TTGyTAAATC 12600 

GAGCTCTTGA AACGAATTAT TATGCTGTGT AATAAATGTC TGAATCTGCT TTTCAGTATG 12660 

ATGTAAAGAT GACTGACTTT CATCAACATG TTGATGAATC GTACGATGCT CAATCCAAAT 12720 

ATAGATGGCA TAGAAGCTTA CTAGTCCAAT AATAATGACT AAAAATACTG GAAAAATAGT 12780 

AGACnCAAAT AACGATCGTC TTAATTGATG TCTATAAGGT TTGTATGCCn TCATTGAATC 12840 

ATCTCCAAAA ATTTATGATG TGGAATATCC GGTAATTTAG ATTTCGGTAT TAAAGGTATG 12900 

TTCTTAAGAT TTTCGATAGA CTGATCGCTT TGTTCACTAA CATCCTTTCG AATTGACTTG 12960 

GCATCGAACT CTGCAACTAA TCGTtGTTGT ACTGAGCGGC TTGTTAAATA TTGCACTAAC 13020 

TTTTTACGCT TAGGATGAGG GTGTGCATTT TTAACTAAAG CAATrCCATC AACATTTAAC 13080 

ATTGTTCCTT CAATTGGATA AACGATTGAT ACAGGATAAC CTTTGTTTTT CCATGTGCGT 1314 0 

GCATCTTGTT CGTAGCTTAG ACCTGCGTAA TATTTACCTT TTGCAACATC TTCAATGACT 13200 

TTAGACGTCT TTGACAGTTG CATCGCATGG TTTTGGAATT GATGCACATC ACTTACTCGA 13260 

TGATGCATGC TATAAATAGC ACGCATATGT TGATAGCCTG TCGTTGTTGT ATTTGGATTT 13320 

GAGTACGCAA TTTTACCTTT AAGTATAGGT TGTAATAAAT CTTGATAACC TCGAATCTTA 13380 

ATATCTCCTT GTAAATCTGA ATTCACTACT ATAACTGTTG GCATTAATAG AAAACTAGTA 13440 

ACATATTTAT TGTTCGAGCG ATAATCCTCT AATTGCTGTG TTACAGATGT ATCTTGATAG 13500 
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CCACGCTCCG AAAAATCTTC GTTATGCAAG TTTGAAAGCA GTACTTGAGT AGATCCGTGT 
TTAATTTCAA TTTTGACATG CTCTTGTTTT TCAAATTCAT TTAAAATTGG ACGAATCAAG 
TTTGATTGAT ACGGAGAATA AACTGTTAAT ACATTTTTAT CGGATTCAGA GTGACGCGTA 
TTAGCGCATG CTGaTAAAAA AATGAGAAAT AATAGCAAGA TATAAATTTT TGATTTCATG 
ATATCCCATC AATTCTATGT ATATTTTAAT ACAATAATTT TAGCAATAAA TGACGCATAA 
GTAATGTTAA ATATTTAGAA ATGTTTATAG ATGACTTGTT AAGACGTTGC AAATGTTGTG 
ATAGCACAAA ATTTTTGTTT GTCAAGACGA TTTACCGAGG CTGTAAAATC AAACTGTTAT 
ATTTTATTTG TAGCTGTTAT ATAAAAATCG GCAAGATATT GAACGGTTCA AAAGTGAATT 
TTTACGTCAA TAAAAGTATT TAATCCAGTC TCTTCATATA TAAAAGTAAA TCTTTCTAAG 
TGTTGATTTA ACGCTTATCA ACAATCATTT TTTATAAACA AATATATACT CCTAAATTAA 
CTTTTAAAGC AATGAAAATA GTGAACATTA TAACTGTTGT GTAACAGAAT GCAATTAGCA 
TATTACTGTT ACACAAATTA GTACAGTTTC TATGTTTTGA CATACATTTG ATGAAAATTG 
TACATAATTT ATGTGAAAAA AATCACAACA AACATGCTAC AATGACTATG AAAACGTTAA 
CATAGCATTT CAAATTCACA ACATTATACA GATGGAGGCG TTTAGTATGT TAGAAACAAA 
TaAAAATCAT GCAACAGCTT GGCAAGGATT TAAAAATGGA AGATGGAACA GACACGTAGA 
TGTAAGAGAG TTTATCCAAT TAAACTACAC TCTTTATGAA GGTAATGATT CATTTTTAGC 
AGGACCAACA GAAGCAACTT CTAAACTTTG GGAACAAGTA ATGCAGTTAT CGAAAGAAGA 
ACGTGAACGT GGCGGCATGT GGGATATGGA CACGAAAGTA GCTTCAACAA TCACATCTCA 
TGATGCTGGT TATTTAGACA AAGATTTAGA AACAATTGTA GGTGTACAAA CTGAAAAGCC 
ATTCAAACGT TCAATGCAAC CATTCGGTGG TATTCGTATG GCGAAAgcAG CTTGTGAAGC 
TTACGGTTAC GAATTAGACG AAGAAACTGA AAAAATCTTT ACAGATTATC GTAAAACACA 
TAACCAAGGT GTATTCGATG CATATTCTAG AGAAATGTTG AACTGCCGTA AAGCAGGTGT
AATCACTGGT TTACCTGATG CATACGGACG TGGACGTATT ATCGGTGACT ATCGTCGTGT 
AGCTTTATAT GGTGTAGATT TCTTAATGGA AGAAAAAATG CACGACTTCA ACACGATGTC 
TACAGAAATG TCAGAAGATG TAATTCGTTT ACGTGaAGAA TTATCAGAAC AATATCGTGC 
ATTAAAAGAA TTAAAAGAAC TTGGACAAAA ATATGGTTTC GATTTAAGCC GTCCAGCAGA
AAACTTCAAA GAAGCAGTTC AATGGTTATA CTTAGCATAC CTTGCTGCAA TTAAAGAACA 
AAACGGTGCA GCAATGAGTT TAGGTCGTAC ATCAACATTC TTAGATATCT ATGCTGAACG 
TGACCTTAAA GCAGGCGTTA TTACTGAAAG CGAAGTTCAA GAAATTATTG ACCACTTCAT 
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AGACCCAACT TGGGTAACTG AATCTATCGG TGGTGTAGGT ATTGACGGAC GTCCACTTGT 15420 

TACGAAAAAC TCATTCCGTT TCTTACACTC ATTAGATAAC TTAGGTCCAG CTCCAGAACC 15480 

AAACTTAACA GTATTATGGT CAGTACGTTT ACCTGACAAC TTCAAAACAT ACTGTGCAAA 15540 

AATGAGTATT AAAACAAGTT CTATCCAATA TGAAAATGAT GACATTATGC GTGAAAGCTA 15600 

TGGCGATGAC TATGGTATCG CATGTTGTGT ATCAGCGATG ACAATTGGTA AACAAATGCA 15660 

ATTCTTCGGT GCACGTGCGA ACTTAGCTAA AACATTACTT TACGCTATCA ATGGTGGTAA 15720 

AGATGAAAAA TCTGGTGCAC AAGTTGGTCC AAACTTCGAA GGTATTAACA GCGAAGTATT 15780 

AGAATATGAC GAAgTATTCA AGAAATTTGA TCAAATGATG GATTGGCTAG CAGGTGTTTA 15840 

CATTAACTCA TTAAATGTTA TTCACTACAT GCACGATAAA TACAGCTATG AACGTATTGA 15900 

AATGGCATTA CATGATACAG AAATTGTACG TACAATGGCA ACAGGTATCG CTGGTTTATC 15960 

AGTAGCAGCT GACTCATTAT CTGCAATTAA ATATGCACAA GTTAAACCAA TTCGTAACGA 16020 

AGAAGGTCTT GTAGTAGACT TTGAAATCGA AGGCGACTTC CCTAAATACG GTAACAATGA 16080 

CGACCGTGTA GATGATATTG CAGTTGATTT AGTAGAACGC TTCATGACTA AATTACGTAG 16140 

TCATAAAACA TATCGTGAT7 CAGAACATAC AATGAGTGTA TTAACAATTA CTTCAAACGT 16200 

TGTATACGGT AAGAAAACTG GTAACACACC AGACGGACGT AAAGCTGGCG AACCATTTGC 16260 

TCCAGGTGCA AACCCAATGC ATGGCCGTGA CCAAAAAGGT GCATTATCTT CATTAAGTTC 16320 

TGTAGCTAAG ATCCCTTACG ATTGCTGTAA AGATGGTATT TCAAATACAT TCAGTATCGT 16380 

ACCAAAATCA TTAGGTAAAG AACCAGAAGA TCAAAACCGT AACTTAACTA GTATGTTAGA 16440 

TGGTTACGCA ATGCAATGTG GTCACCACTT AAATATTAAC GTATTTAACC GTGAAACATT 16500 

AATAGATGCA ATGGAACATC CAGAAGAATA TCCACAGTTA ACAATCCGTG TATCTGGTTA 16560 

CGCT&TTAAC TTCATTAAAT TAACACGTGA ACAACAATTA GATGTAATTT CTCGTACATT 16620 

CCATGAAAGT ATGTAACAAA ATTTAAGGTG GGAGCACTAT GCTTAAGGGA CACTTACATT 16680 

CTGTCGAAAG TTTAGGTACT GTCGATGGAC CGGGATTAAG ATATATATTA TTTACACAAG 16740 

GATGCTTACT TAGATGCTTG TATTGCCACA ATCCAGATAC TTGGAAAATT AGTGAGCCAT 16800 

CAAGAGAAGT CACAGTTGAT GAAATGGTGA ATGAAATATT ACCATACAAA CCATACTTTG 16860 

ATGCATCGGG TGGCGGTGTA ACAGTCAGTG GTGGCGAACC ATTGTTACAA ATGCCATTCT 16920 

TAGAAAAATT ATTTGCAGAA TTAAAAGAAA ATGGTGTGCA CACTTGCTTA GACACATCGG 16980 

CTGGATGTGC TAATGATACA AAAGCATTTC AAAGGCATTT TGAAGAATTA CAAAAACATA 17040 

CAGACTTGAT ATTATTAGAT ATAAAACATA TTGATAATGA CAAACATATT AGATTGACAG 17100 
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TATGGATTCG ACATGTCCTT GTGCCTGGTT ATTCTGATGA TAAAGACGAT TTAATTAAAC X7220 

TAGGGGAATT TATTAATTCT CTTGATAACG TCGAAAAGTT TGAAATTCTG CCATATCATC 17280 

AGTTAGGTGT TCATAAGTGG AAAACATTGG GCATTGCATA TGAATTAGAA GATGTCGAAG 17340 

CGCCCGATGA TGAAGCTGTT AAAGCAGCCT ACCGTTATGT TAACTTCAAA GGGAAAATTC 17400 

CCGTTGAATT ATAAATACAA TTCAGACCGA AAAGAAAGCA TATGCAACTT CAAGAGTGAA 17460 

GGGGCATATG CTTCTTTTTC AATTGAGTAT TGAGTATTAG CAAGACGTAG TAAGTATATG 17520 

AGACAACTTC TACAATGGTT GAAGGAAGAC GTTTTTGTAA GTAGCTATGC TGATAAAGAA 17580 

TGTGATGTCT TGTTAAAGGT GGGGTTCCAA TATCATCATT TAGCTGATGT TGAATGGGTT 17640 

ATTATTTGCT ACTTGCATAT GAATATGAGT CTTTTCAAAT TTTTATTGAC CCTGAGTAAT 17700 

GAAAAATATT AAGATGAAAC TTAATATTAA AgCAATGCGG AGCGTGATTA TGAAGAGAAT 17760 

TAGTAAAGAT ATATGGGCAG TATTTAAATT ACTGTATCaA AATAAAGGGC GTTTTAGCAT 17820 

TAATGCCTTA CTATTGCAGT TAATCATGAT TTTTATTAGT AGTACATACT TAATTTTACT 17880 

ATTTAATATG ATGTTAAAAG TAGCTGGcAA AGCCAACTTA CGATTAACAA TTGGACGGAA 17940 

ATCGTTAGTC ATCCCGCCAG TGTGATACTT CTTATTATAT TCATATTAAG TGTTGCCTTT 18000 

CTGATTTATG TAGAGTTTTC ATTGTTAGTT TATATGGTTT ATGCCGGCTT TGATCGACAG 18060 

ATTATTACAT TTAAATCCAT TTTTAAAAAT GCCTTTGTAA ATGTGCGTAA ACTCATAGGT 18120 

GTACCAGTTA TTTTCTTTGT CATTTATTTA ATGTTAATGA TACCCATTGC CAACCTAGGA 18180 

CTAAGTTCAG TATTAACAAA AAATATTTAC ATACCTAAAT TTTTAACGGA AGAACTTATG 18240 

AAAACGACGA AAGGTATAAT CATTTACGGT ACCTTTATGA TTGCTGTATT TATATTAAAT 18300 

TTTAAATTAA TATTTACTCT ACCGTTAACG ATTTTAAACC GCCAGTCGTT ATTTAAAAAT 18360 

ATGAGACTAA GTTGGCAAAT TACGAAGCGA AATAAGTTTC GGCTTGTTAT AGAAATAGTT 18420 

ATATTAGAAC TCATCATTGG TGCGATTTTA ACATTAATTA TTTCAGGAGC AACATATCTT 18480 

GCTATTTGTG TAGATGAAGA AGGAGATAAG TTTTTAGTCT CATCAATTTT ATTTGTTGTA 18540 

TTGAAAAGCG CATTGTTCTT CTATTATkTA TTtACGAAAT TATCATTAAT CAGTGTGTTA 18600 

GTACTGCACT TAA 18613 

(2) INFORMATION FOR SEQ ID NO: 113: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1214 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113: 

AAAGTTTTAA AAGGGGTGAG ATACTTGGCG AATAATCCAT TCCAGCTTTG CGTTTAAAAG 60 

GAATTATACT TGCCATTGTC GGTGCTTGTT TATGGGGATT AGGTGGTACT GTTTCTGATT 120

TCTTGTTCAA ATATAAGAAT ATTAATGTCG ATTGGTACGT CACTGCTCGA CTTGTAGTCA 180

GTGGTGTTTT CTTACTTATT ATGTACAAAA TGATGCAACC CAAACGTTCA ATATTTAGCG 240

TATTCCAAGA TCGACGTATG TTAGGCAAAT TACTTATCTT CAGTATACTG GGCATGTTAG 300

TAGTACAATA TGCTTATATG GCATCTATTA ATACAGGTAA TGCTGCGATT GCAACATTAC 360

TACAATACAT TGCGCCAGTT TATATTATTA TTTGGTTTGT CATAAGAGGC GTTGCAAAAC 420

TAACATTATT TGATGTGCTT GCTATTATCA TGACACTATT AGGAACATTT TTATTATTAA 480

CAAATGGTTC ATTTTCTAAT TTAGTCGTCA ATCCTGCAAG TTTATTCTGG GGTATTTTAG 540

CTGGTGTAGC ACTCGCTTTT TACACAATTT ATCCTTCAGA CCTACTTAAC CGCTTCGGTT 600

CGATTCTAAT TGTCGGGTGG GCAATGCTTA TTTCTGGTGT TGCGATGAAT TTACGCCATC 660

CAATTTGGCA CATTGATATC ACTAAATGGG ACATATCAAT TATATTATTT TTAATCTTTG 720

GTATTATCGG TGGTACCGCA CTCGCATTTT ATTTCTTTAT CGACAGTTTA CAATACATAT 780

CAGCGAAAGA AACAACATTA TTCGGAACTG TTGAACCTGT CGTAGCCGTT ATCGCAAGCA 840

GTCTATGGTT ACATGTGGCA TTCAAACCAT TTCAAATCGT AGGCATCATT CTTATTATGA 900

TTTTAATTTT ATTACTATCA CTTAAAAGAC AACCTGAAAC ATTAGATGAA TAAGAAAACT 960

CTGATAATCA CTTTAGCAAG TAACTATTAT TTAACAACGT AGTTACCTTA TAGGTGATAT 1020

CAGAGTTTTT TATTTTAGTT AATAATATTT TTCACTTGGT ATAAAAAaGC GTCGTCGCTC 1080

TGGTAATCGG AAATACTGGA ATAAAATATG GAATTGGGTA ATAATCCCAG GTAnTAAAAG 1140

TCCAIGTTCC GATAnCCTnT CCGCAnCTCC AACCAAATTT GCCGATAAGG TTCCAAAAGG 1200

CATCCTGGGG GTAC 1214

(2) INFORMATION FOR SEQ ID NO: 114: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9458 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114:

ATTTTGGTTT CATTCACGAT GGGGTnATAC AGCAAACACA nCTAAAATAA CTATCAATAG 60 
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CTTAGACAAT AAAAAATATG CCACTACAAT CGCTAATATT ACGATTAAAA AAGAAGCGTT 180 

AACGATTACT TTCATCGTTG TTCTATCTCT GAACATCATA TTAAAGACAA CTAGACTAAT 240 

TGATAATGAA ACAGCAAAAA AAGTAATAGC TAACACTAAT TTCATCATAA ATAGACAGAC 300

TAAACCTATG ACTAATAATG TATTAGAAAT TACAGCTGAC GTTTTTAACA TTCTCGaATT 360 

AATATGCACT CACCCTTTTT ATTTAAATAA CTTACATAAT CATAATAATA CATGATGTTT 420 

CATAGGCCTG TCGATGATTG ATTCACAATA GCACGTGATT TTTTTGTTTT TCAATATTAT 480 

TCATTTATTC CATCAAAAAC ACCCTTTTTA ATTTTTACAA AAATTAAAAA AAGTGCTCCT 540 

ACACTGCTTG CATGTAGAAA CACTTTTTCA TTGTAATGTT ATTCTTCTCG AGACATACCT 600 

TTTAGCATAT TAAGCATGTA TGTTAAACTA CGGTTCATGT CGTCATCTTT CAATACGCCC 660 

AATAGACTTC TTATAGTTGT CTTAGCATTT GGACTCGCTT GATTGGCAAC GTGTAATCCT 720 

TTATTAACTT TATTTAGGAA GTCGCTTAAA TCTGATACAT TGAGTTCACC TAATAAAAAT 780

ACCATTGAAG CCATATTAGA TAATAGCCCT GTATAAATAT CTTTATTAAG TTCAACTGGA 840 

AATTTATTTA TGATGACTTG ACGTCCTCGA ATTGCACCAT TTAAAGCATC TAATAGTTTT 900 

GCATCATCTA ATGTTTTAAT AAGCTTGATT GCTTTTAATA TACTATCTTT ATTCGCTGCA 960

ATTGCCTCTG TAACTTCATT TAAACTTTCT AACTTAATTT GTTCTTCTGA TTTTTCTAAG 1020 

CGTCTAATTT TAGAAGATAT TCTCTCAGCC ATTATTTATC CACCTGATTT CCCGGGAAAA 1080 

CATAATCTGA ACGTTCCCAT TTTTTCTGTA CTTGAACACT GTACTGCGGT TGACGTTTTT 1140 

TATTGACACG GAAATTATTA GGGTTCAACG GTGACTTACC ACGTTTCGTA ATTACCTCCA 1200 

AACGACAGCT AGTACGTTTA TAAGATGGTG TATCCGTGTA TTGATCAACA TCACTaTTAG 1260 

TTAATAAGTT AATTGCACCT AGATCTCCAT TTTCCATCGC aTCaTTATTT AATGGAATAT 1320 

AGATTTCTTT ACCTTTAACA CGATCTGTCA CGTGAACTTG TAATACCGCT TCTCCTGTyT 1380 


CAGAAATCAG CTTAACTTCT GCACCTTCAT GAATGCCTCT ATCTTCAGCA AGCTCTGGAG 1440 

AAATTTCAAC AAATGCACGT GGCACTTTGT ATTTAATCAT TGGTGTTTGA TAAGTCATAT 1500 

TACCTTCATG GAAGTGCTCT AACAATCGAC CATTGTTTAC ATGAATATCA TAAATTTCAT 1560 

CTTGCTTAAA GTAATTATCA AATGATAATG GGAATAATTT TGCTTTACCA TTATCAAAAT 1620

TGAATCCTTC TAAGTATAGA ATAGGCTCAT CAGTACCATC AGGTTGTACT GGCCATTGTA 1680 

AACTATTGAA TCCTTCTAAA CGATCATAAC TTACCCCAGC ATATAGAGGT GTTAAGCGTG 1740 

CTACTTCATC CATAATTTCA CTAGGATGCT TGTAATTCCA ATCAAATCCT AATCTATTAG 1800

CAATTGCTTG GAAAATTTTC CAGTCAGGTT TTkAATCACC AAGAGGTTCT AATGCTTGGT 1860 
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TTGCTGGCAA TACAACATCT GCGTATGTTG 
CTGTGAATGT TAAAAATTCA TCTTGGACTA 1980

CCATGAAATC TAATTTTTCA AACGCAGCTT GTACAAAATT AATATTTGAA TCCACAATAC 2040

CCGTATCTTC ACCATATAAG TACAATGAGT GTACTTCTCC GTCATGTATA CCTTCTACCA 2100

TTTCATGATT ATCTTTACCA GCTTTTGGAT TCAATTTAAC GCCATATTCT TTTTCAAATT 2160

TAGCGCGAAT ATCATCCGCT TCAATACTTT GATAACCAGT AATCTTATCA GGCATACTTC 2220

CCATATCACT ACATCCTTGA ACATTATTAT GTCCACGTAA TGGATACGCA CCAGTACCAG 2280

GACGACGATA ATTACCTGTT ACTAATAATA AGTTTGAAAT CGCTGTACTT GAGTCACTAC 2340

CAATGTCTTG TTGTGTAATA CCCATTGCCC AACAAATTAC AACAGATTCA GCTTTAGCAC 2400

ATTCTTCAGC AAATTTAATC AATTCTGATT CAGGAATACC TGTTGCTTCT TCAGCAAAAG 2460

CCATTGTAAA TGTTTCTAAT GATTTGTAAT ATTCATCAAA ATCATCTACC CACTCATCAA 2520

TAAATGCTTT ATCGTGTAAA TCATGATCAA TAATATACTT AGTCACTGCA CTTAACCACG 2580

CTAAATCCGT ACCTGGTTTA GGTTGATAAA AACGATCCGC ACGTTCTGCC ATTTCATGTT 2640

TTCTAATATC AAATACATGT ATTTTTTGAC CAAATAATTT TTGTGCACGT TTCATGCGTG 2700

ATGCGATAAC TGGATGAGCT TCGGCTGTAT TAGTACCTAT CAATACAGAC ATTGCCGCTT 2760

TTTCTAAATC TTCAATACTA CCTGAGTCAC CGCCGTGTCC AACCGTTCTA AATAAGCCTT 2820

TTGTTGCAGG TGCTTGGCAA TATCTTGAAC AGTTATCAAC GTTATTTGTG CCAATAACTT 2880

GTCTTGCTAA TTTTTGCATT AAATACGATT CTTCATTCGT CGCTTTAGAA GAAGAAATGA 2940

ATGATAGTGC ATCTGGGCCA TGCTTTTCTT TAATAGCTGT AAAATTATCT GCAATGACGT 3000

TTAAAGCTTC ATCCCATTCT ACTTCATGGA ACTCACCATT TTTCCTTACT AGTGGTTTAG 3060

TTAATCGTTG ATCTGAATTA ATATGTCCCC ATGAAAACTT ACCTTTAACA CAAGTCGCAA 3120

TTTTATTTGC TGGAGAATCA TGTGATGGTT GTACTTTTAA AATTTCTCTA TCTTTAGTCC 3180

AAACTTCAAA TGAACAACCC ACACCACAAT AAGTACACAC TGTTTTAGTT TTCTTAATAC 3240

GCTCTTTACG CATTTCTGCT TCTGAATCTG AGATTGCAAA TAGTGGACCA TAACCAGGTT 3300

CTGCTTTTTT AGTTAAATCA ATCATTGCTG CTAATGAACC AGGTTCCGTA TCAGTCATAT 3360

AACCCGCATT ACCTTCCATA TTCACTTCCA TCATGGCATT ACATGGACAT ACCGTCGCAC 3420

ATTGACCACA AGATACACAT GAAGACTCAT TAATCGGTAC ATCATTATCC CAAATAACAC 3480

GTGGATGTTC ACGATCCCAA TCAATTCTAA TAGTTTCATT CACTTCGATA TCTTGACATG 3540

CTTCTACACA ACGCCCACAT AAGATACATT GATTTGGATC ATAACGATAA AATGGGCCGT 3600

AATCTTTTTC GTATGGCTTC TCTTTATATT CATACGTTTG ATGCTGAAGC CCCCATGCAT 3660
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TATGCTTTTC TAAAATTCGA TCAAGCGCTT CTTTTTGAGC ATCTTTCACA TCATTGTTCA 3780 

CAGTATTTAC AGTCATTGGA OGATCAATCA CCGTACTACA TGAACGTTCA ATTTTACCGT 3840 

CAATCTCAAC AGTACATGTA TCACATGTTT GAATTGGTCC CATCGACTCG TTATAACAAA 3900

TTGAAGGTAC AAAAGTATCT TGTGATTTAA TAAATTCAAG TAAATTCGTA CCTGGTTCTA 3960

CAAGATAATC TTTTCCATCA AGTGTAACCA CCAAATGTTC TTGCATATTA CTCACCCCGT 4020

CTATATATAT TTTCCGTAAA TGACTTTTAA TAAATTGCTC ATATCCACCT AAAATAACGA 4080 

TGCCCCACAC ATCTTTCAGA TAGAATTAAT TTAATTGTAT TACTTTATGT ACTAGTTGTT 4140 

AAGTAAAATT TTGTATTTTG CCTTTTTACA ATCATTTTTA TTTGAAATAT TTTGCGCGAA 4200 

ATTAAATCAT CTTTTTGTTT AATTGAAAAT AATTATCATT ATTAGTTTTC CAATTATCTG 4260

TTTCACGCTT TTTGCCATAT CTTTCACAAC CTTATTAATG ACAATATTTA ATAATCACCT 4320

CACCTAAAAA TCGTTATACT ATTTATAAAT ACCCTTTTTC TGAAAATTAA TAACCCAAGT 4380

TTGATAAATA TCTACTATCA TTTAGAAGGT AATATTTATC TTTAAATTAA ATTTGTAATG 4440

GATTAATTTA TAAAAATCAA ATCAGGCATT AAATAAAATA GCCCATAAAT ACAAAGTGTT 4500

ATCACCTTCT ATTTACGGGC TATTAGTTCT ATTCGTTATT CTATTTACAG ATCATTCTAT 4560

CTAATTAATT TGTGTACAAT TTTGATAACT TATTTTCCCT TAGTTTACTA CTCTAGATTA 4620 

TCTTTTAATA ACTTAGTACT TTCAGCTTTT GACTGCTCAC TAGGAATGAA GTAGTACAAT 4680 

CCGTCACTTT GAATGCCGCC TTGACCACTC AATTGATGTT TATTAATCGT GTCATTAGCA 4740 

TCTTTATAAT TGCTTCTAAT CGTATTCAAA TCACCTAATG TTAAATCTGT TTTAACATTA 4800 

TTTTGAATTT CATTCATTAG ACTATTAAAA TGTGTAATCG ATGATGGGCT TGCAATCTTA 4860

TTGGCCATCG CTTCAA3CAC AATTTGCTGA CGTTGTTGTC GACCAAAGTC ACCACCAGCA 4920 

CCTTCTTCTT TACGACTTCT AATAAACTTC AATGCTTGAT CACCATTTAC ATGTGTCTGC 4980 

TGTCCTTTTG TAAAACGAAC ACCATCAACA GTGAATGTAT CATTACTTAC TACATCAACA 5040

CCGCCGATGC TATCTATCAT ATTATGCAAA CCATCCATAT CGATTGTCGC ATAATGATCA 5100

ATTGGCACAT TCATTAATTT TTCAAGTGAT TTAACAGCCA TATTTGGTCC ACCATATGCA 5160

TAGGCATGTG CAATTTTTTC AGTAGTACCA CGGCCAACAA TTTCCGCTCT TGTATCACGC 5220

GGTATACTTA CTATTTCAGT TTTCTTCGTT TTAGGGTTGA TAGATAAAAT CATAATACTA 5280 

tCACTACGCT CTCCGCCACC CTTTTTCTTA CGATCAGCAT CTGAATCGAC ACCAAATAAA 5340 

GCGATTGTGA ATGGATCACC ATCGTTTAAA CTCACTTTTT TATCTCTTAA TTCTGAATGA 5400 

TTGCGATCTA ACGGATTGTG TATCTTATTA CCAGTAATAA AAATTTTAGC AGCTACATAC 5460 
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GGTAGGCTCA TTTTACTTTT AGACGAACGT TTCAATCCCA CCACTCCTTT ACTATTCCTT 
ACATACTTTG TCTGTTTTCT CTATTTATTA TATAGTAAAA TAATTTTTTT ACTATACTTC 
TGTAGACGTA TAACTATTTT TTATCATTTT TTATCTCTAG AGAATATCTA TCTGTATTTT 
TGATAACCAC CATTTGCATT TAAAATTTTA AGTACCGTTT CATGACATGC TTTATTACTT 
ATAATAAAAG GTGCACCCTT TAAATGATCA ATTGCCTTAC CATCTAAAGT CGTCATTTTT 
AGATTCAATA GTTCTGCAAA TAAAAACTGT GCAGCAATGT CCCAAGGTTT AGGATTTGTA 
TTAATATGTG CCCCAAATTG ACCTTTTGCC ACTCGCATAG AATCTAATCC GCAAGCACCA 
ACTAAACGAT AACTAAATGA GGCGTCAAAT AAATCTTGCA CCGTATCTAG ATTCATCACT 
TGTGCATTAA ACGATATAAT AGCGTCTTCC AATTTTAACG ATGGTGGTTC TTCCATCTTA 
ATTCCATTAC AAAAAGCACC TTCTCCTCGT ATTGCTTTAT AAAGCTTTTT ATGCGGATAA 
TCATATACGT ACGATAACAT TGGTTTACCT TCATAAAAAT ACGCCAATAT AATACAATAA 
TCTTCTTGCT GTTTTACTAA ATTGGCAGTT CCATCAATGG GATCCATAAT CCATAAATGA 
TTAATTTCAT TCG7AATCAT TTCATTACTT TTTTCTTCCG CTAATAGTTG GTGTTCCGGA 
AAATGTGTTG CTAAAAATTG TTGGAATTGT TGTTGAATCT GTTTATCTAC ATTTGTAACT 
AAATCAAATC GATGACGCTT AGTTTCTGTA GTCATTTCCA TAATTAATTG CGGAATAACA 
TTGTCTATTT GTTTCAACCA CGAACATATT AACTTATCTA TTTGCTGTAA TGTTTTATCT 
GTCATTTCGT CCACCACTTC TCATATCATT ATCATTTTAT TATTACCCTA TATTAAAAGA 
ATCAACAATA CAACTGAAGA CTTCTTCATT TTATGCATAA AAAAATCGGC TAGTCACGTG 
CTAGCCGACA AATAGAAAGG AAAGTAAGTA ATAAATATTG AAGATGTTGT GATGTAACTT 
GAACGATTAA AAGCTATCTG TTATATAGCT CTACCCCTTT GTTTAATCGC TCCCCCTGTT 
ACAAGTAATA TCATAGCACA ATCTTTTTTA AAATGTAAGC GTTTTCCACA AAATTTTTAC 
GATTTTTTTA AAAAGATATT GAAAATGTCC TCATTGTCAC TCTTATGTTA TACTTTGTGT 
AATATATCAT CTTTTAGGAG GTGGCTGTCA TGAATAAAGC TGAAAGGCAA AATTTAATAA 
TTACTGCAAT TCAACAAAAT AAAAAAATGA CCGCTTTAGA ATTAGCTAAA TATTGCAACG 
TATCCAAACG CACAATTTTA AGAGATATTG ATGATTTAGA AAATCAAGGT GTTAAAATTT 
ATGCGCATTA TGGGAAAAAT GGTGGTTACC AAATACAACA AGCACAATCT AAAATTGCAT 
TAAACTTATC TGAAACACAA TTATCAGCCT TATTTTTAGT GCTTAATGAA AGTCAGTCGT 
ACTCGACATT ACCATATAAA AGCGAAATCA ACGCAATTAT AAAACAATGT TTAAGTCTTC 
CACAAACACG CTTAAGAAAA TTGCTTAAAC GCATGGACTT TTATATTAAA TTTGATGACA 
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ATGTGATGTT AGTAGATCAT AGGGTTGATG ATAATATTAA AGCTGAAAAC GTTATATTTA 
TTGGCCTTTT GTGTAAACAT GGACATTGGC ATGCAGTCAT TTATGACATT GCTCAAGACA 7440
AAACTGCCGA ACTCGAAATT GAAAATATTA TAGATATTTC GTATTCATTC GGTAAGACGA 7500
TTCAAACCAG AGACATATCC ATTGATAACT ATCATCAATT TTTAAACCCC ATCGATTCCT 7560
AAAAAACAGC AGTAAGATGA TTTTCAATTA GAAAATATCT TGCTGCTGTT CTCTATTTAT 7620
ACAATACTTC GTATTGAATG GnTTCGCTTT CCTAGGGTGC CGTCTCAGCC TTGGTCTTCG 7680
ACTGGCACTG CTCCCTCAGG AGTCTCGCCA TTAATACTAC GTATTAACAT GTAATTTTAC 7740
TTTGAAATAC TTAAAAAAAT AAAACACTTT GCCCAACTTA CACTACCAAT AGAAACTGCT 7800
GTTAGAATTC CTCAAAATGA TATTTCGCGA TATGTTAATG AAATTGTTAA AAAGATAGCT 7860
GATAGCGAAT TCGATGAATT CAGACATCAT CGTGGCGCAA CATCCTATCA TCTAAAAATG 7920
ATGTTAAAAA TCACCTCATA TTCATATACT CAATCTGAAT TTTCTGGCCG TAGAATAGAA 7980
AAATTACTTC ATAACAGTAT TCGAATGATG TGGTTAGCTC AAGATCAAAC ACCTTCTTAT 8040
AAAACTATTA ATCTTTTTAG AGTGAATCCT AATACTGATG CGCTAATTGA ATCTTTATTT 8100
ATTCAGTTTC ATAATAAAAT GCATATCAAA AAAGCTGATT TCTATCAAAT AATTAATAGA 8160
AATCAGCTTT TTTCaTTGCC TAAAAACTTA ATGTCCCGAC CTCTTTATCT ACGCATAAAT 8220
ACTTATTACT GATATAACGA AAGAAACAAA ATTATTTGCT ATATGTAATG CAATTGTTGA 8280
ACCTAGGTTT CTTCCAGATT TTAAATAAGT GAAAACTAAT ATGATGGATA GTATGAGATA 8340
TGGACCAAAC TCAAACGGCG ACTTTGCATC AGTCACATGA ATAAATGCAA ATAAGAACAC 8400
CGAAACAATA CTCATAGCTA TAAAATTAAA CTTCTTACCT AATTCTCCAA TTAAAATATG 8460
TCTAAATACG ATTTCTTCAA CTATTGGACC TACAATCACA ATTAATAAGA ATGCTACAGG 8520
TAAAAATGCA GGCACTTCAA ACATTTTATT TAGCTCAAGT TCATTGGCTG TTtCACTATA 8580
TTGCAAATGT TTAGGTAGAA ACTGTGTCAT ATATTCATAT GTATAAATTA AGATGAGAGC 8640
AATAATATAC GTTATTGACA ATCTAAGCCA ATATTTTTTG ATATACGCAA AACCAGCTCG 8700
AAGCCTTGAT GGCATCACTT TTAAATGAAA TAAATAAAAT GCGCCAATCC CAATCGTATA 8760
TGCTAAAGCT TGTGTGATAG TCGCTACAAA TATCAGATTA CTATCGATTT CATAATAACC 8820
AAACAAAATT GGTCCTATGT AAGCTGCAAT TGTGAGTGCA TAAAATATAA CACCTATAAT 8880
TGGAATTATA AGCAAATCTC TCCATGCTAT ATCTTTAAAC GTGTATTTCT TTTTTTCATT 8940
TTCCaCTGTT ATATCCtTTC CTGTTTAATA ATTGATTTTT GGAGGTACTT CTACATGATA 9000
AACGAAACTA AGTATATGAG ACAACAAATT ACTAATTTGA TTCAAATCAT TGATACGATT 9060
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ATAGTTACTA ATGAATTGAA TAAGTTCAAA GGCTTTGAAA CATCATATAT AATAAACGAA 9180 

AATCAAGTTT CCTATTATGA AATTATAACA CTACTTAATA AACGTCCCCT CgACAAGTCG 9240 

ACTATGGTAA CAAAATTCAA TATCTTAATT TTTATCATAC AGAACTATCT AACGCATTAT 9300

TTGCAATTAA ATTTGCCCAT TAACCTATTT TTCATAAAAT GTCATTTAAA CAAGTTATTT 9360 

ATTAAAATTC ACTTTATTAC ATAAATTATA CAATTArAAA GTTTCTTCAA ATTGTAAAGA 9420 


TGCATTAATC GAGTTATAAT CATAATGATT AAGATGGT 9458 
(2) INFORMATION FOR SEQ ID NO: 115: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 910 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115:

AnGCGTATCA TGTCACGCAT TTTAACTACT TCTTTACCAC AAGATTATAC AGTCACATTA 60

GTTGATCGTA TGCCATTTCA TGGATTGAAA CCAGAATTTT ATGCTTTAGC TGCGGGCACG 120

AAATCAGATA AAGATGTTCG TATGAAATTC CCTAATCATC CACAAGTGAA TACAGTTTAT 180

GGTGAAATTA ACGACATAGA TTTAGATGCT CAAATTGTCT CAGTCGGTAA TTCTAAAATT 240

GATTATGATG AGCTAATCAT TGGTTTAGGA TGTGAAGATA AATATCATAA CGTTCCAGGA 300

GCCGAAGAAT ATACACATAG TATTCAAACA CTCTCAAAGG CTCGGGATAC TTTCCATAGT 360

ATTAGTGAAC TACCAGAAGG TGCTAAAGTC GGTATCGTTG GTGCTGGATT AAGCGGCATA 420

GAACTTGCCA GCGAATTAAG AGAAAGTAGA TCAGACTTGG AAATATATCT TTATGACCGT 480

GGGCCGCGAA TTTTAAGAAA TTTTCCAGAA AAATTAAGTA AGTATGTTGC GAAATGGTTC 540

GCCAAAAATA ATGTTACCGT TGTTCCAAAT TCAAATATTA ATAAAGTTGA ACCTGGTAAA 600

ATATATAACT GTGATGAACC TAAAGATATT GATTTAGTTG TATGGACAGC AGGAATTCAA 660

CCTGTTGAAG TTGTTCGTAA CTTGCCGATT GATATAAATA GTAATGGACG CGTGATAGTT 720

AACCAGTATC ATCAAGTACC AACATATCGT AACGTCTATG TAGTTGGTGA TTGTGCTGAT 780

TTACCACATG CGCCAAGTGC TCAGTTAGCC GAAGTTCAAG GTGATCAAAT TGCCGATGTG 840

CTTAAAAAGC AATGGCTAAA TGAACCATTA CCTGACAAAA TGCCGGAACT AAAGGTACAA 900

GGTATCGTTG 910
(2) INFORMATION FOR SEQ ID NO: 116: 
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(A) LENGTH: 10182 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116:


TTTTTGATTC 


AAAGTGGTGA 


TTTAACAAGC ATTTTAAATA GCAATGATTT GAAAGTCACA 


60 




CATGATCCTA CCACTGATTA 


TTATAATTTA TCTGGTAAGT TGTCGAACGA TAATCCAAAC 


120 




GTTAAACAAT 


TAAAACGTAG 


ATATAATATT 


CCTAAAAACG 


CATCAACAAA 


GGTGGAATTA 


130 


15 


AAGGGAATGA 


GTGATTTAAA 


AGGCAATAAT CATCAAGATC AGAAACTTTA TTTTTATTTT 


240 




TCAAGTCCTG 


GAAAAGACCA 


AATCATTTAT 


AAAGAAAGCC 


TTACTTATAA 


TAAAATAAGT 


300 


20 


GAACATTAAT ACTTATGCTG 


TAATTATAGA AACATCCAAA 


TCAT CTATT A 


nAATCCTATA 


360 


TTATAAAAnC 


ACCTCACATA 


ACTCGTTCAA 


CTGTACCAAA 


CCACATTACA 


TTAGATTTTA 


420 




GGCTAACTAT 


TGTGATGTAC 


ATCAAAAACG 


AATTTGTGAG 


GCGTTGTATA 


TTTTACAAAG 


480 


25 


GTGACTAGCG 


TTTCGTATAG 


CATTTCCAAC 


ATTACTACAC 


TCAAGCGTCA 


CGCTAAAGTT 


540 




CGAAATCGAA 

* 


TCCTTTCATT 


CAACAAAAGC 


TCATATCCAC 


TACAAACTTC 


ATATCAAGCG 


600 




TATAAACTAT 


CTTGTGATAC 


TATCTCGATC 


ATATCTATAG 


TATGCATTTG 


TGTTCCGTTT 


660 


30 


CACTGAAGTA 


TATGTATCAT 


CAGTTAAGTA TAAACCGTCA 


TCCTTCAATG 


TTACTTGATA 


720 




AGCATATTTC 


CGTGCTAACC 


AGGCAATATC 


TATATAATTT 


TCTCCTGCGT 


TTTCATAACT 


780 




TCTTAAATCT 


TCAATATGTG 


CACTAACTTC 


AGGGaAAATG 


ATTCTAACAA 


CACTTTCATC 


340 


35 


AACCCAATAT 


TTGTCATGCA 


TCCATCGCAC 


TTGATCTGCC 


AATAAAGGTA 


ACTGCACATC 


rt 

900 




ATTGAAATAT 


AGACGAAAGC 


CGTCACTATC ATACATTTGC 


CGATATGGTA 


ATGGCTGTTT 


a ^ a 

960 


40 


TCTAATCACT AACACCTCGC 


CACCCATTAC 


GGTGCCTTCT 


CTAGTATCAT 


CACTTCCACC 


1 A A A 

1020 


CGAAGCTTCA 


TACGTTGTTG 


GGTCAACCTG 


TAGTCCATGT 


ACATCTCCAA 


TATAAGCATC 


1080 




TGGTTTATGT 


TCCATTGCAT 


GTCCATGTGC 


AATCAATGCT 


AATATTGTAG 


ATTGTGAAAA 


.1140 


45 


TTGAGGCTCC 


CATTCAATGC 


GATTAGGATG 


GCTACTATAA 


ATTCTAGGTT 


CATCTATAGC 


.1200 




CTGCTGAATA 


TCCATGCCAA 


ACACTAATAC 


ATTGATTAAT 


GTTTGCGCAA 


CACTAGCAAT 


1260 




GATACTTATG GCACCAGGTG CACCTACTGT TAATATTGGC TTCCCGTGAT ACATCACAAT 1320


CGTTGGAGCC ATGTTACTTA GTGGTCGTTT ATATGGTGCA ATTTCGTTAA TACCACCATC 1380




TACTACATCA AAGCCATCCA TTGTCGTATT CAATAACACA CCGTAGCCTG GAATCGTGAT 1440




ACCTGAACCA TAAATCATAC CAATTGATGT CGTAAATGAA GCAATATTAC CTTCCTTATC 1500
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10 




ATCAGACACA ACACCATGCT CTATATCAAT ATTTGCTTTA TTGCTATCAA TGAGCGTACT 1620 

GCGTGCTTTT AAATAATCAT CATCAATTAA TGACTGTACA GGCACCTCAT GAAAATTATC 1680 

ATCCGCCAAG TATTGCGCAC GATCACTATA TGCTAAATGC ATCGCTTGTA TCAAATGATG 1740 

CAAGTAATCA ACAGATCTTG GACCCATAGA TGGTAAATCG ACATGTTCTA ATAACTTCAA 1800 

TATTTGAATT ACCGTGATAC CGCCAGAACT AGATGGTCCC ATTGaATAAA TGTCATAGTC 1860 

TTTAAATGTT GCACTGATTG GCGCTTTAAT CTGAATGTCA TATTTGGCTA GATCCTCTAA 1920 

AGTGATTGTC CCACCACATG CTTTGACAAC ATTGACTAAT TGTTTCGCAA TGTCACCTTT 1980 

ATAAAATGCA TTAAACCCTT GTTCTCTTAA TATTTGAAAT GTCTTACCTA ATTCGGGTTG 2040 

TACAATCCAA TCACCTTCAC GCCAATATTG ATTTTCATGC GTAAATACTT GTGCCGTTTC 2100 

ATGATACTTT GTCAATCGTG CGTGTTGCTG GCGCGAATAT TTTTCAGTAG CCCAATTGGC 2160 

TGCATGACCT TCAATGGCTA GTTCAATTGC AGGATTAATT AAATCTTCCA ATGACAATTT 2220

AGCATAACGC TTGTGAATAT AATCAAACAG CTTTGGAATT GCTGGCACAG CGACAGTTTT 2280 

ACCATGTGTA GTCATATCAA AAAATGATTT ATATTCGCCT GAATCATCTA GATAAAATTG 2340 

TTTGTCTACA TGTTCAGGTG CTGTCTCACG TGCATCAAAC GCAGTTATAC TGCCAGTACT 2400

TTGCTCATAA TATAGCAAAT ACCCGCCACC ACCAATACCT GATGCAAATG GTTCTACCAC 2460 

ATTCAATGCC AGTTGAATTG CAATCACTGC ATCCATGGCG TTGCCACCTT GATCTAATAC 2520 

ATCCTTACCA ATTTTAGCCG CAAGAGGATG TGATACGGAA ATTAACCCTT CTTTAGATGT 2580

TTTTGTCTGT TTGTCATTTA AG7TAATGAC CATACTATAT CCTCCTACTT TCTGTTAAAT 2640 

ATTTAAAACA TTATTGATTA ATGGCTTTTT CTACTTTTTC TAAATCTTGA CGTTGCTCGT 2700 

TACCAGTATC GACAAGTGGT GTAATCGGTG ATGCAATTTT AAATTTATCG CCACGATAAA 2760 

ACTTAATAAA TTGATCCTGA TCTATCGCAT TAACTACTGC TTGTCTCAAG TTTGGATGCG 2820

TCTTAAATAT ACCTTTTTTA ATATTTAGCA TTAAAAAGAC TGACTTGCGT CCATTTTTGC 2880

GAATAATGCT TAAATTTTTA TCCGACTTAA TTAAATCAAA ATGTTTTTGA TTCACATCTG 2940 

CCAACATATC AATTGAATGA TTTCTAAGTT CTGACAATGC ATTATTCGGG TCACCATTAA 3000 

ACTTCAATGT AATATTTTTA ATTTTAGCTG GTCCATAACT ACCTTTTTCT GTTTCGTTGA 3060 

ATCCTGGATT ACGTTGAAAC GTTGCTTGAT ATGCATTTTT CTGTGTCATA ATGTATGCGC 3120 

CACTTGCATA CAGCGCATTT TTCCCATCTG AATTTGCAGG AATTGTACTG CTATCCCCAT 3180 

ATCCTTTTGG ATATTCTTGA TTTACTTGAT TAACAAATTT TTTAGATAAA ATGCCTGCCG 3240

AAGAGTGTGT TAAGTAATTT ACCTCTCGAG GCATCGATTG ATCTGTCGTA ATTTTAACAA 3300 
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TATAAGCTTT AATCAACTTA TCATAGATTG ATTTATCGTC CTTGTCTTTC TCTTTACGCA 3420 

ACTGATCGAT GTCCTCATCT TTTAATATCT TGATGTCATT TATATGTTTG TGCATATTGT 3480

AAGTATTATT GTTAGGCACA GACTTTTTAT CACGTGCTCT ATCTAAAGAA AACTTAACAT 3540 

CTTCAGCCGA TACACGCTCT CCAGTATTAC GTGCTTGTCC ATTGACCACT TTCGCAAAAT 3600 

AATCATCATC TCTTAACAAG AAATAAAATG CTTTATTGTC CTTATTCACA GCATAATCAT 3660 

GACTTAACGA ACCTTTCGTT GTTAAATGAT CATTTTCATC TAATAATAAT AACCTTGTGT 3720 

ACATATTCAT ATTAATTGAA TATACTGACG GCGCAATTGA ACGTATTGGA TCCAATGTAG 3780 

GAATTTCACC ATCTTGTTGT GTCATCACAA GTGGCCGCGT ATCTCGTTCT CTACTATTGT 3840 

TGTAATCAAA TTGTTGCCAT ATTAATGCAC GTGAATTTGG CAATCCAACA CTATTTTTAT 3900 

CTAACACTTT ATTGTCATAT ACTAAATTCT TTTTTGATCC ATATAAAGGC GCCATATACC 3960 

CTTTATCAAA TACAACTTCA TCTTCAATTT GCTTATATGT TTGTTTAACA TCTGCTTCAT 4020

TTTGAGTAGA AGCTTTATTT AACAACTGGT CTACATGTTT ATCTTTCAAT AAACTATTTG 4080

ATCCTGTAGA ACTAAATAAT GCCGTCATAG CATAGTTCGG GTCACCAAAC ACTGTCATCC 4140 

AGTCATCAAT TTGGATATCA TAATTGCCGG CTTGACGTTG TGTACGATAG CTACCATAAT 4200

CTGGTTGGAT ATTCATCTTC ACGTTAAATC CTGCATTTTC CAATTGATCT TTAACGATAT 4260

TCATATCATT TTCATAACTT GCTTGTCCTA GGAAATGTAT TGTTGGTCGC TCGCCTTTCA 4320

CTTCAACTTT CGATGACTTT TGAGCCACTT CTGATTTCGT AGGGACACCA CAACCACTTA 4380

ATACCAACGC TAAAACTATA ATTGCGATAC TAATGATTTT CTTCACATCT ATCCCTACCT 4440

TTTTAATGAA TTCTTGGATC TAGTGCATCA CGCACTGCAT CACCTATAAA ATTAAATGCT 4500 

AAAACGACGA ACATAATACA AACACCAGGT ACAATAGCTA AATTACTGTG CGTTTCCAAG 4560 

TAGT^ACTAC CGG7ACGTAA AATGTTGCCC CATTCAGCTA CATCAGGTGC AACACCAAGT 4620

CCTAGGAAAC TTAAACTACT TGTTGTTAAT ACAACCACAC CTATATTTAA TGAAAAACGT 4680

ACAATCATAG GCGCAATCGC ATTCGGTAAA ATATAACGCC ATATGATATT CCAAGTGTTT 4740

TCACCAGTGA TACGTGCTGC ATCTACATAT TCCATGCGTT TAATTTCTAA AACACTGGCA 4800

CGCATTGTCC GTGCAAATGA TGGTATATTA CCGATACTTA AAGCAATAAT TAAATTTGGA 4860 

ATACTTGCTC CAAATGATGC AATAATTGCC ACCGCTAACA ATAATGATGG AATTGCAAAC 4920 

ACTACATCTA AAATTCGCAT TATTAAATTA TCAATATGAT TAAAATAACC TGCGATAGTG 4980

CCTAGTAACA CACCAAAAAT AACTGCAATA ACTACTGAAA TAATTGAAAT TGAAAATGTC 5040 

AGCTTCGTTC CTACAACTAC GCGTGTAAAT AAGTCTCTAC CGAAATCATC AGTACCAAAC 5100 
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GTATCAAATG TAAATTGTGA CACAATTGAT AATGTCAGCA TGTAGACTAA AATAAGTAAC 5220 

CCGATAATCG CAATACGATG TCTAGTAGTT TTTCGTATAA ACGATTCCCA CCCGTTATAA 5280 

CTATGTATTT GCGATGTACG TTGGTAACGT CTAATACTTA CAAACATTAA TAATGTAAAT 5340 

ACGTTGCCTG TTAATGTCAT CAACAATAAC AACACTTCGA CGATACGTCG CCATAGGTCA 5400

TGATGCTTCC ATGTTTGTTC CGTTGTTAAA ATAATAATTA AAATGATGGT TAAAACGATT 5460

AGCAATGTTT CAGCAATATA GAACGTATCG GCCACATAAC CTTTAAAAAG ATTTAATGCA 5520 

CTCGTTAATA TAACTAAAAT ATAAGTTGCT ATGGCGTAAC TTGCGAATAA TTTTAAGGAA 5580

GCTATCTTTG AATTAAGTTG TGCCATATGC CTCACTTCCT TTCGTTGATT TCACTACGTA 5640 

ATTTTGGATC GATTAAAGCA TAAAATATAT CAATAATTAA GTTTGCTAAA GATATTACAA 5700 

TTGATATATA TACGACCCCA CCCATGACTG CTGGAATATC AGGTATTAGT TGTTTTTGGA 5760

CGATATAACG CCCGATACCA TTAATGTTAA ATACTTGTTC CGTCACTGCT GAACCGCCTA 5820

GTAACTCTGC CACTAGAAGA CCAACTAACG TTACAATTGG AATAATGGCA TTTTTCAAAA 5880 

TATGTTTAAT AACAACTTGT GTCGTCGATA ATCCTTTTGC ATAAGCAGTT AAAACATAAT 5940 

CGctGCGCAT TACTTCAAGT ACAGAAGACC TTGTCATACG CGTGATAGAA GCAGCAATAC 6000

TTGTTCCAAT GACAAGTACA GGTAAAATCA ACGATATTGG ATGTTCTGGC ATATAAGATG 6060 

GTGGCAAAAT ATCCAATTTC AATGAGAACG CTAAAATGAA TAATAGCCCT TGCCAGAAAC 6120 

TTGGAATAGA TAAACCAATT AATGCAATTA TCATTAACGT GATATCAAGC CAACTATTTC 6180

GCTTCATCGC ACTGATAATA CCAATTGGTA TTGCAATAAT TAATGCCACC ATTAGCGCTA 6240 

ATACTGCGAC AATTATTGTA ATTGGAATTC TTTCGCCAAC TGCTTTAGTC ACAACCTCAT 6300 

TCCCTTTGTA AGTCGTACCT AAGTCAAAGG TAAAAACACC CTTGATGGTA TCCCACAATT 6360

GAAT5AAATA AGGTTCGTTA AGATGATGTA ATACATTGAA TTGATGTATC TGTGCCTTTG 6420 

TTGCSATTTTG TCCCAGTATG CTATAAGCCG CATCAAGCGG TGAAAAATAC AGAATGGTAA 6480

ACACACTGAC AATAACACCA ATGATGACAA TCACAGCCAT GACAATTCGT TCAAAAATAT 6540

ATCTAACTAA TGGCTGTAAA TAAAAAGTCA ATAAGATGAA CATCGGCAAG GCCAATATCA 6600 

CTTTGATCAT GATGAACTTA TGAAATAATA CATTTTCAAA GTATGTTGAA AAATGTGCTT 6660

GTTCAATATT CTTTGAACTC GTATTAGAAC TTTGTGCCTT GAATATTTTT AATGCTTCTT 6720 

TATGTATTTG TGTGGATGAC TTTTGCTGCG ATAAATATTT ATATTTTTGA TGTAACGCCT 6780 

GTTCAATTTC TGAAATTTCA GAATTATTAG CGTAAAAATT TTTCCTCTTA GCAGAAAAGA 6840

AAAACTTTAT CACTGCATAT AAAAATATTG GCAAGCTTAA TACCGATAAT ACAAACTTGT 6900 
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CTTGTAAAAT AATCTTGAGT AGATTACTAT GATATACAAA AGTATAGAAT AAATTTACAC 7020

ATTTGTGaAT AGGGAGGCAC AACATCATGT CAAATTTATT AGAAGTCAAC AGTCTGAATG 7080

TACAATTCAA TTATGATGAA ACTACAGTTC AAGCGGTAAA AAACGTCTCT TTCGAATTAC 7140

GAAAAAAACA TATCCTAGGT ATTGTTGGTG AATCAGGATC AGGAAAAAGT ATTACCGCTA 7200

AATCTATTTT AGGGCTACTA CCAGATTATC CAGATCACAC ATTAACAGGA GAAATTATTT 7260

TTAATGGGCA ATCGTTAAAT AATTTATCAA CTTCAGCGTT ACAACAAATT CGAGGTAAGG 7320

ATATTTCAAT GATTTTTCAA GATCCACTCT CTTCGTTGAA TCCAAGATTA ACGATTGGCA 7380

AACAAATTAC AGAAGTAATA TTTCAACATA AACGTGTATC TAAATCTGAA GCAAAGTCGA 7440

TGACAATAGA CATTTTAGAA AAAGTAGGTA TAAAACATGC AACTCGACAA TTTGATGCTT 7500

ATCCACATGA ACTTTCTGGT GGTATGCGTC AACGTGTCAT GATAGCAATG GCATTGATTT 7560

TAAAGCCACA AATTTTAATC GCAGATGAaC CAACAACGGC ATTAGATGCC AGTACACAAA 7620

ATCAATTACT GCAGTTAATG AAGTCCCTTT ATGAGTACAC AGAAACATCT ATTATTTTTA 7680

TCACTCACGA TTTAGGCGCT GTGTATCAAT TTTGCGACGA TGTGATTGTA ATGAAAGATG 7740

GAAGTGTCGT TGAAAGTGGC ACGGTTGAAA GTATTTTTAA ATCGCCACAA CATACCTATA 7800

CAAAACGCTT AATAGATGCG ATTCCTGATA TTCATCAAAC GCGTCCGCCA AGACCGTTAA 7860

ACAATGATAT TTTATTAAAA TTCGATCGCG TGAGyGgGAT TACACATCAC CGAGTGGCAG 7920

CCTATACCGA GCAGTTAATG ATATTAACTT GGCTATTAGA AAAGGCGAAA CATTAGGCAT 7980

TGTCGGTGAA TCAGGGTCAG GGAAATCGAC ATTAGCTAAG ACGGTCGTCG GTCTAAAGGA 8040

AGTGTCAGAA GGCTTTATTT GGTATAACGA ATTACCATTA AGTTTATTTA AAGATGATGA 8100

ATTGAAATCT TTACGACAAG AGATACAAAT GATTTTTCAA GATCCATTCG CATCTATTAA 8160

TCCAAGATTT AAAGTCATTG ATGTGATTAA ACGACCACTA ATCATTCATG GGAAAGTCAA 8220

AGATAATGAT GACATTATTA AAACTGTCGT ATCGTTGTTA GAAAAGGTTG GCCTAGATCA 8280

AACTTTCTTA TATCGCTATC CACACGAATT ATCTGGTGGG CAACGTCAGC GTGTAAGTAT 8340

CGCGAGAGCA CTTGCTGTTG AACCTAAAGT GATTGTTTGC GACGAGGCAG TGTCCGCTTT 8400

AGACGTTTCA ATTCAAAAAG ATATCATCGA GTTATTAAAA CAATTACAGT TAGACTTCGG 8460

CATCACTTAT TTATTCATCA CACATGACAT GGGTGTTATC AATGAAATAT GTGATCGCGT 8520

TGCAGTTATG AAAAATGGCG AAATCGTTGA ACTGAATAAC ACAGAAGATA TTATCAAACA 8580

TCCGCAGTCA GACTATGCAA AGCAACTTAT TTCAGAAGTA GCAGTTATTG CTAAATAAAA 8640

GTCATGCGTT GTGCAACTTT ATCACTGTAT GGTCTGAAAT AAATTGCGCG ACTTCTGATG 8700
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TATCAAGTTT TAGGTGCTTT GCCATGATTT AAGAGTCACC CCCATACTTT GGGCATTTTA 8820 

ACGCCAGAAT AAATCCCCCG CCACTATGTG AAGTGTGGGG GATTATTTAT ATTTTATTAG 8880 

AATATTCAGA TTTTTGAGTG TGTCAACTTA GCTTAGTCAA TGTATATTTA ACGTCACTTA 8940 

CTCTTTTTCT TTCATAATTA ACACATTCAA ATAAACTTTG ATCAAAAAAC ACAAAGTTAA 9000 

AAGTACCATC TTGTAATATG CTCTCATACA TTATCCCGTC ATATTTAAGG CTTCGAATAT 9060

AATCAGCTAA ATATTGAAAT GGCAAATAAT CTATTCCTTG TTCATCGCTT GGATTTGTTA 9120 

TTCCTTTATG AATCTTTTTT AATGTTTGGT AATTTACAAA ATACTTTCTA AATCCATCAT 9180 

CGCCAGCTTT GATTGCATTA CTAGTTAAAT TAGTTAAATT CGCAATTTTC AATTTCTCTT 9240

TTGTCACGTT TTTTTGTAAC TTAACCTTAC CTATATAAAT AATGTCATTA TGCTTAGGTT 9300 

TAACTTCTTC TATACTGACC TGTTCTTTTG TACTAAGGTA TAATACGCTT ATCCATTTAG 9360

AATTCAATCT TCCTGCCGTT GCAAATCCCT TTGGTGGTGA CATTAGTTCA CTTTTCTCTG 9420

TAATGAACTT AACTATTCTA GATCTATATA ATGGTTCAAA TCTTTCTCTA AATTCCTCAA 9480 

TACTATAGTA ATTAGTAGTG ATATCGAGAA AGAACGCTAA ATTCTCTAAA TTGATCATAT 9540 

TTTTATGAAA TCTATTTTTA TACTTCAAGC TCTCACAAAA TCCATCCCAG TCATTATTTG 9600 

CTACAATTAG ATTTTTATTT GTATATTTTT TATCGTTTAT GATTTTAGCG CCTACTAAAT 9660 

CTTCCAACAC TCGTCTATCT AAATTTTCAT CATCTTTAAA AAGTTCATTT AAAATACAAC 9720 

TTATTTGAGC TTCCTCAACA TTAAATATAC TCCAGTCGTC TTTTAATGCT ATTTCAATCT 9780 

TTTTACCTTC TTTTGGGCTA AAAGTATCTG GTAAATTTAT ACTAATATCA TATAATTCTA 9840

ATGCTGGTCT TAAATAATCT CTAATAAGTT CTAATTTATC TATGTCCTTA GTCGTATCAA 9900

ATATTTTAAC ACCAAGATGA TTGTTATCAA TATCACAATT GTCAAATTTG CTATTTATCA 9960 

TTTGCAATGA TTTCTACGAT TTCAGTATTA TTAAAACATT TTTCACATAT TTTCATTTTG 10020 

AGACtCCAAG TATCTATTCA TAATTTCTAG GTGATGCATG ATAGATAACC TTTTAATTAA 10080

ACCTAATCCT GGATaCTTAT TATTTTCATT TAATTCTTCA AATTGTCCCA AGCGCATAAG 10140 

ATCTATTTTT AATATCTAAG TTTTTTGACC ATGTTACTAA TT 10182 

(2) INFORMATION FOR SEQ ID NO: 117: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3491 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
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AACTCAGGCA ATTGAAACAG CATTAGGTGC TTCATTACAA CATGTCATTG TAGATTCAGA 60 

AAAAGATGGA CGCCAGGCTA TTCAATTTTT AAAAGAACGT AATTTAGGTC GTGCGACGTT 120 

TTTACCATTA AATGTTATAC AGAGTAGAGT GGTAGCGACT GATATTAAAT CTATTGCTAA 180

AGAGGCAAAC GGATTTATTA GTATCGCTTC GGAAGCAGTT AAAGTAGCAC CAGAATATCA 240

AAATATTATC GGGAATTTAT TAGGTAATAC GATTATCGTT GATCATTTAA AGCATGCAAA 300 

TGAATTGGCA CGTGCGATTA AATATCGAAC TCGTATTGTT ACTTTGGAAG GTGATATTGT 360 

AAATCCTGGT GGtTCTATGA CTGGTGGTGG CGCTCGTAAG TCAAAAAGTA TTCTGTCTCA 420 

AAAAGACGAG TTGACAACAA TGAGACACCA ATTAGAAGAT TACTTGCGTC AAACAGAATC 480

ATTTGAACAA CAATTTAAAG AGTTGAAGAT AAAAAGTGAT CAATTAAGTG AACTGTATTT 540 

TGAAAAAAGT CAAAAGCATA ATACACTTAA AGAGCAAGTG CATCATTTTG AAATGGAGCT 600 

CGATAGATTA ACTACACAAG AAACACAAAT AAAAAATGAT CATGaAGAAT TCGAATTTGA 660

AAAAAATGAT GGTTATACGA GTGACAAAAG TCGACAAACT TTGAGTGAAA AAGAAACTTA 720

TCTAGAAAGT ATTAAAGCAT CTTTAAAACG ACTAGAAGAT GAAATTGAAC GCTACACAAA 780

ACTTTCTAAA GAAGGTAAGG AAAGCGTTAC TAAAACACAA CAAACCTTAC ATCAGAAACA 840 

ATCTGATCTT GCTGTGGTTA AAGAGCGTAT TAAAACACAA CAACAGACAA TAGATCGATT 900 

AAATAATCAA AATCAACAAA CTAAACATCA ATTAAAAGAT GTTAAAGAAA AAATTGCATT 960 

CTTTAATTCG GATGAAGTGA TGGGCGAACA AGCTTTTCAA AATATTAAAG ATCAAATTAA 1020 

TGGTCAACAA GAAACGAGAA CACGCTTATC AGATGAATTA GATAAATTGA AACAACAACG 1080

TATTGAGTTG AATGAACAAA TCGATGCGCA AGAAGCTAAA CTACAAGTTT GTCACCAAGA 1140

TATTTTAGCT ATCGAAAATC ACTACCAAGA TATTAAAGCT GAACAATCAA AGCTAGATGT 1200 

ATTAATTCAT CATGCGATAG ATCATTaAAT GATGrATATC AATTGACTGT TGAACGTGCG 1260 

ArATCTGAAT ATACGaGTGA TGrATCGATg ACGCATTACG TAAAAAAGTT AAGTTAATGr 1320

AGaTGyCGAT TGATGrACTA GGTCCTGTAA ACTTAAATGC AATTGAACAA TTTGAAGAGT 1380 

TAAATGAACG TTATACATTT TTAAGTGAAC AACGTACAGA TCTTCGTAAA GCTAAAGAAA 1440

CATTAGAGCA AATTATAAGT GAAATGGATC AAGAGGTTAC TGAAAGATTT AAAGAAACTT 1500 

TCCATGCTAT TCAAGGACAT TTTACAGCTG TGTTCAAACA ATTGTTTGGT GGAGGCGATG 1560 

CAGAATTGCA ATTAACTGAA GCCGATTATT TAACAGCTGG TATTGATATT GTGGtACAAC 1620 

CACCGGGTAA AAAGTTGCAA CATTTATCGT TACTGAGTGG TGGTGAGCGT GCATTAACTG 1680 

CTATTGCTTT ACTATTTGCA ATTTTAAAAG TAAGATCTGC ACCTTTTGTT ATATTAGrTG 1740
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TATCAGACGA AACACAATTC ATTGTTATTA CACACCGTAA AGGAACAATG GAATTTGCAG 1860

ATAGGTTATA CGGTGTAACA ATGCAAGAAT CAGGTGTTAC TAAACTTGTG AGTGTGAATT 1920

TAAATACAAT AGATGATGTG TTGAAGGAGG AGCAATAATG AGCTTTTTTA AACGCTTAAA 1980

AGATAAGTTT GCAACAAATA AAGAAAATGA AGAAGTTAAA TCCTTAACAG AAGAACAAGG 2040

TCAAGACAAA TTAGAAGATA CACATTCTGA AGGTTCAACG CAGGACGCAA ATGATTTAGC 2100

AGAAAATGCT GAAGTGAAAA AGAAGCCACG CAAGTTGAGT GAAGCGGATT TTGATGACGA 2160

TGGCTTAATA TCAATTGAAG ATTTTGAAGA AATTGAAGCT CAAAAAATGG GTGCTAAATT 2220

TAAAGCAGGA CTCGAAAAAT CTCGTCAAAA TTTCCAAGAA CAATTAAATA ATTTGATAGC 2280

GAGATATCGT AAAGTAGATG AAGACTTTTT TGAAGCTTTA GAAGAAATGT TAATCACTGC 2340

AGACGTCGGT TTTAATACAG TGATGACGTT AACTGAAGAA TTACGTATGG AAGCACAACG 2400

ACGTAATATT CAAGATACTG AAGATTTGCG TGAAGTCATT GTTGAAAAGA TCGTAGAGAT 2460

TTACCATCAA GAAGATkATA ATTCAGAAGC TATGAACTTA GAAGATGGTC GTTTAAATGT 2520

CATTTTAATG GTTGGTGTGA ATGGTGTTGG TAAAACAACA ACAATTGGAA AATTAGCTTA 2580

CCGATATAAA ATGGAAGGTA AAAAAGTAAT GTTAGCTGCG GGCGATACTT TTAGAGCGGG 2640

TGCTATTGAT CAATTGAAAG TTTGGGGCGA ACGTGTTGGT GTAGACGTAA TTAGCCAAAG 2700

TGAAGGTTCT GATCCAGCTG CTGTTATGTA TGATGCgATT AATGCCGCTA AAAACAAAGG 2760

TGTTGATATT TTAATCTGTG ATACCGCTGG ACGTTTACAA AATAAmACAA ATCTAATGCm 2820

AGAATTAGAA AAAGTTAAGC GTGTAATTAA TCGAGCAGTG CCAGATGCGC CTCATGAAGC 2880

ATTACTATGT TTAGATGCTA CAACTGGTCA GAATGCGTTG TCACAAGCTA GAAACTTTAA 2940

AGAAGTAACA AATGTTACAG GTATTGTATT AACGAAATTA GATGGTACAG CCAAAGGTGG 3000

TATCSTATTA GCCATTCGTA ATGAATTGCA CATCCCAGTT AAATATGTAG GTTTAGGTGA 3060

GCAATTAGAT gacttacaac catttaaccc TGAAAGTTAT GTCTACGGCT TATTCGCTGA 3120

TATGATTGAA CAAAATGAAG AAATAACAAC AGTTGAAAAT GATCAAATTG TAACAGAAGA 3180

AAAGGACGAT AATCATGGGT CAAAATGATT TAGTtAAAAC GTTACGAATG AATTATTTGT 3240

TTGATTTTaT CAATCCTTAT TGACGAATAA ACAACGTaAT TATTTGGAAT TATTTTATCT 3300

TGAAGATTAT TCTTTAAGTG AAATCGCAGa TACTTTTAAT GTGAGTAGaC AAGCAGTTTA 3360

TGATAATATA AGAAGAACTG GCGATTTAGT TGAAGATTAT GAAAAGAAAT TGGAATTATA 3420

CCAGAAATTT GAGCAACGCC GAGAAATATA TGATGAAATG AAACCACATT TAAGTAATCC 3480

AGAACAAATA C 3491
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5 


(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4253 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118:






AGTACGTTTT ATAATTATAA GTACGTAATT AACATATTAA CATATCGCAA GTATGTATTT 60




AAATAAgATT GTTATAATTT CAAAGTTCAT CCAAGaTTAT GGCGTTTGCA TTTACCTATT 120


AAAAACGTTA TTATATCAAA GATGCGAAAG ATAATACGGG TTTATTTTAT GAAAGTGAGA 180




AGGATAAAAT GGATAATGAG CAACGCTTAA AAAGAAGAGA GAATATAAGG AATTTCTCGA 240




TTATAGCACA TATTGACCAC GGAAAATCTA CATTGGCTGA TAGAATTTTA GAAAATACCA 300


AATCAGTTGA AACAAGAGAT ATGCAAGATC AGTTACTAGA TTCAATGGAT TTAGAAAGAG 360




AACGTGGTAT TACAATCAAA TTAAACGCgT ACGTTTAAAG TACGAAGCTA AAGATGGAAA 420




TACTTATACA TTCCATTTAA TCGATACGCC TGGACACGTC GATTTTACAT ATGAAGTGTC 480


ACGTTcTTTG GCAGCTTGTG AGGGCGCGAT TTTAGTAGTA GATGCGGCTC AAGGTATCGA 540




AGCACAAACA TTAGCAAATG TTTATTTAGC ATTAGATAAT GAGTTAGAGT TATTGCCTGT 600


TATTAACAAA ATTGATTTAC CTGCTGCAGA ACCTGAACGC GTGAAACAAG AAATTGAAGA 660


TATGATAGGT TTAGACCAAG ACGATGTTGT TTTAGCAAGT GCTAAATCTA ACATTGGAAT 720




TGAAGAGATA CTAGAGAAAA TAGTTGAAGT TGTGCCAGCT CCAGATGGTG ACCCAGAAGC 780


ACCACTAAAA GCGTTAATAT TTGATTCTGA GTATGATCCA TATAGAGGGG TAATTTCATC 840




GATAAGAATT GTGGACGGTG TTGTTAAAGC CGGAGATAAA ATTCGAATGA TGGCCACTGG 900




TAAAGAGTTC GAAGTAACAG AAGTTGGAAT TAATACACCT AAGCAGCTTC CAGTTGATGA 960


ATTAACAGTT GGTGATGTTG GTTATATTAT TGCAAGTATT AAAAATGTTG ATGATTCTAG 1020




GGTTGGTGAC ACCATCACAT TAGCTAGTAG ACCTGCATCA GAACCATTGC AAGGTTATAA 1080




GAAAATGAAT CCAATGGTAT ATTGCGGACT GTTCCCAATA GATAACAAAA ATTATAATGA 1140


TTTAAGAGAA GCATTAGAAA AATTACAATT GAATGATGCA TCATTAGAAT TTGAGCCTGA 1200




ATCGTCACAA GCATTAGGTT TTGGTTATAG AACTGGTTTC TTAGGTATGT TACACATGGA 1260


AATAATTCAA GAAAGAATTG AAAGAGAATT TGGTATTGAA TTAATTGCAA CTGCACCATC 1320


TGTAATTTAT CAATGTGTTT TAAGGGACGG TTCAGAAGTG ACGGTTGATA ACCCAGCACA 1380




AATGCCAGAT CGTGATAAAA TTGATAAAAT ATTTGAGCCA TATGTTCGTG CAaCTATGAT 1440
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TATAAATATG GACTATTTAG ATGATATTCG TGTAAATATT GTTTATGAAT TACCTTTAGC 1560 

TGAAGTTGTA TTTGATTTCT TCGATCAACT TAAATCTAAT ACTAAAGGAT ATGCATCATT 1620 

TGATTATGAA TTCATCGAAA ATAAAGAAAG TAATTTAGTC AAGATGGATA TTTTATTAAA 1680 

TGGTGATAAA GTGGATGCGC TAAGCTTCAT AGTTCATAGA GATTTTGCAT ATGAACGTGG 1740

TAAAGCATTA GTTGAAAAAC TTAAAACGTT AATTCCAAGA CAGCAATTTG AAGTACCTGT 1800 

ACAGGCTGCA ATAGGACAAA AAATTGTAGC GCGTACAAAT ATTAAATCAA TGGGTAAAAA 1860 

CGTTTTAGCT AAATGTTATG GCGGTGACAT AAGCCGTAAA CGTAAATTAC TTGAAAAACA 1920 

AAAAGCAGGT AAAGCTAAGA TGAAAGCAGT TGGTAATGTT GAAATTCCAC AAGATGCTTT 1980

CTTGGCTGTA TTGAAAATGG ATGATGAATA ATTTTAAAAA ATCAATTAAC AATTTACAAT 2040 

GAATAAAGTT TAATAACTAA AAAGAGGGAG CCTAGGATAA ATTAACGTCC TGGGCTTTAC 2100 

AATGTTATAT TGGCAGCCAT CGACAGAGTT AAAATGAGCT TATAACAATG GGGCCCCAAC 2160

ACAGAAGCTG ACGAAAAGTC AGCTTACTAT AATGTGCAAG TTGGGGTGGG GCCCCAACAT 2220 

AGAGAATTTC GAAAAGAAAT TCTACAGGCA ATGCAAGTTG GGGTGGGACG ACGAAATAAA 2280 

TTTTGCGAAA ATATCATTTC TGTCCCACTC CCTTATGCAT GAGTTTTACT CATGTAATTT 2340 

TATTTTTAAG GACATATTAC ATCTGGCTAA TGTGTAAGAG CCACTACATA ATAAATCATT 2400 

AGTGGTTCTT TATTATTTCT ATCTCACTCC CTCTAAACAA GAATAAATAT TAAAATGAAT 2460

CGATATATTA GACAATCATT GATTAAACGT TAAAGTTAAA AGTAAGAATA ATTGCAGATA 2520

GTCCAACAGG ATATAGCCGA TTGGATAAAA AGTCTGAGAA GCGGGGCATT AAAATGACGG 2580

TACAAAGTGC ATATATACAT ATTCCATTTT GTGTAAGAAT ATGTACATAT TGTGATTTCA 2640

ATAAATATTT TATACAGAAT CAACCTGTAG ATGAGTACTT AGATGCACTA ATCACAGAAA 2700 

TGTCTACAGC AAAATATAGG ATCTTAAAGA CCATGTATGT AGGTGGCGGC ACACCAACGG 2760 

CCCTTTCTAT TAATCaGTTG GAAAGATTAC TTAAAGCAAT ACGTGATACG TTTACAATCA 2820

CAGGCGAGTA TACATTTGAA GCAAATCCTG ATGAGTTAAC TAAAGAGAAA GTCCAACTAT 2880 

TAGAGAAATA TGGAGTAAAA AGGATTTCAA TGGGCGTTCA AACATTCAAG CCGGAGTTAT 2940 

TGTCTGTTTT AGGTAGAACG CACAATACTG AAGATATTTA CACTTCGGTG TTAAATGCTA 3000 

AAAACGCAGG TATTAAATCA ATCAGTTTAG ATTTAATGTA TCATTTACCG AAACAGACGA 3060 

TTGAAGATTT TGAACAAAGT TTAGATCTAG CTTTAGATAT GGATATTCAA CATATTTCGA 3120 

GTTACGGCTT AATACTTGAA CCTAAAACCC AATTTTATAA TATGTATAGA AAAGGCTTGC 3180 

TCAAACTTGC TAATGAGGAT TTAGGTGCTG ACATGTATCA GTTGCTGATG TCTAAGATAG 3240 
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GTACGTTTAG CTAAGAAGCT TTGTGAGATT GCACCTGGAG ATTTTGAAAA AAGAGTGACC 540 

TTCGGATTAA CCGGATCAGA CGCAAATGAT GGCATCATTA AATTTGCCAG AGCATATACA 600 

GGGCGTCCTT ATATCATTAG TTTCACTAAT GCATATCATG GTTCAACTTT TGGCTCATTG 660 

TCTATGTCAG CTATTAGTTT AAATATGCGC AAACATTATG GTCCGTTATT GAATGGTTTT 720 

TATCATATTC CGTTTCCAGA TAAATATCGT GGTATGTACG AGCAGCCACA AGCTAATTCA 780 

GTAGAAGAAT ATTTAGCACC CTTAAAAGAA ATGTTTGCGA AGTATGTACC TGCTGACGAA 840

GTAGCATGTA TTGTTATTGA AACGATACAA GGCGATGGTG GACTTTTAGA ACCAGTTCCA 900 

GGGTATTTTG AAGCGTTAGA AAAGATTTGT CGTGAACATG GTATTTTAAT CGCTGTCGAT 960

GATATTCAAC AAGGTTTTGG GAGAACAGGT ACATGGAGTT CAGTCTCGCA TTTTAATTTT 1020 

ACGCCTGATT TAATCACTTT CGGAAAATCC TTAGCAGGTG GTATGCCTAT GTCAGCAATT 1080 

GTTGGACGCA AAGAGATTAT GAATTGTTTA GAAGCACCAG CACATTTATT TACAACAGGT 1140

GCTAATCCAG TTAGTTGTGA AGCTGCATTA GCCACAATTC AAATGATTGA AGATCAGTCG 1200 

CTTCTTCAGG CTAGTGCGGA AAAAGGGGAA TATGTTAGGA AACGAATGGA TCAATGGGTA 1260 

TCTAAATACA ATAGTGTAGG CGATGTTAGA GGTAAAGGTC TGAGCATTGG TATTGATATT 1320

GTTTCCGACA AAAAACTCAA AACACGTGAT GCCAGTGCGG CACTTAAAAT TTGTAATTAC 1380

TGCTTTGAGC ATGGCGTAGT TATTATAGCT GTAGCAGGAA ATGTGTTGCG ATTCCAACCG 1440

CCATTGGTAA TAACATATGA GCAATTAGAC ACGGCGTTAA ACACTATAGA AGATGCACTG 1500 

ACTGCTTTGG AAGCAGGTAA CTTAGATCAA TATGACATA7 CTGGACAAGG TTGGTAATAG 1560 

CGATTATCTT AATATAAAAT AAAAAATCAT TTCCACATCT GGATGTTAAT CAGATGGGAA 1620

ATGATTTTTT TTATTTTTTA TTTTGGTGGG TGGTATTCAG CTACGTCATT TTTCTTAGAA 1680 

TGTGTAAGTC CATAACTTAA ATATAGGATG ATACCAACAA TAAACCAAAT TAAAGTGTAT 1740 

AATTTCGCTT CGAATCCTAA TCCCCAGAAT ACTAGCAATA CTAAAACAAA TGTAATTGCT 1800

GGTAACACAG GATATAAAGG TAATTTAAAT GCAGGAATTG GTAGATCTTT ACCTTcACGC 1860 

TTTCTCAAAC GATACATTGC TAATGAAACG AACATAAATG CAACAAGTGT ACCTGCTGAA 1920 

ATTAATTGTG CTAAAAATGC GAATGGGAAC ATAGAACCAA TTAAAACACC AATAATAGTA 1980

AGTATAACTA GTGCGCGATT AGGTAAATGT TTGTCGTTTA AGTGGCTTAA CCATGAAGGT 2040 

AATAAGCCGT CACGTCCAAA TGAATAAAGT AAACGTGAGC CTGCTAACAT CATACCAATT 2100 

AATGCTGTAA ACATACCGAT AACAGAGATA GCTTGAACAA TAGCTGCTAC AACACCATGA 2160 

CCACTTTGAC GTAAAGCCCA ACCAACAGGT TCAGCATTGT TTGCGTATTG TGAGTAATGG 2220 
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CCAAGAATAC CTCTAGGCAT TGTCTTTTGA GGATCAAGTG CTTCTGCTGA GTTTGCTGCG 
ATAGAATCGA AACCGATATA CGCTAAGAAA ATCATTGAAA CACCAGCATA TATGCCTTGC 
CATCCACCAA AGTCACCTGT AGCAGTTACT TTGTGTTCTG GAATAAATGG CACATAGTTA 
CTAACATTTA TTGCTGTTAA ACCTACGATG ACAAATAAAA TAATAGCTAA TACTTTTAAA 
ATAACTAAAA TATTTTCCAT ACGAGCTGCT TCCGACATAC CACGTGATAG TAATAATGCA 
GTTAATAAAA TAACGATAGC AGCAATAATA TCGATAAAAC CGCCATTTGT ACCAAATGGA 
TTTGATAATG CTGCAGGTAA TTCGATGCCA ATTGGTTTCA CAAGTCCGCG TAAATTCGCT 
GAGAATCCTG ATGCAACAAA GGCTACGGCG ATAAAATATT CAGCTAATAG AGCCCAACCG 
GCAACCCATC CAAAAAATTC ACCAAATAAT ACATTGACCC AAGAATAGGC TGAACCTGCA 
AATGGCATAG CGGCAGCCAT TTCTGCATAA GTAAATGCAA CTAAACCAGC AACAATAGCA 
GCGAGTAAGA ATGATAACGC AACGGCCGGT CCTGCATGTT CTGCAGCAAC AATGCCAGGT 
AGCGTAAAGA TAGATGTCGA TACAATTGTT CCTACACCTA AAGCTAAGAA ATCACGCACC 
CGAAGTGTAC GCTTTAAATG ACCATCTTTA TTTTGATAGA TAGCCGGATC CTCTTTTCGT 
GCTATTTTAT TGAAAAAACT TCCCATAAAC TTTCCTCCCA AACATTCATA AACAATTCTA 
TACGGTGTTT TTTAATATGT TATATCATAG CACAAATAAT CAATATTTTG TCTAAAAATT 
CTGAAAAATC ACAACTTTAT GTTACGTATT AATGACTTGT CTTGATAACA TCCATAGATT
TTTTAAATGA TAAAACTGAT TATAACAGAT ATTAAATGAA TAAGTACTAT TTTTTGCnAA 
TTTTCTAACA ATTTTGCACA TTATATGTTT AAAATCAATT TCATGTTTAT GGTCTGATTG 
GCTAGTGTGT ATGAAATGTA AnTCTTTGAC TnnGA 
(2) INFORMATION FOR SEQ ID NO: 120: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13508 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120: 
ATCAGGTAAT GCCATGCGTT TAGCTGAAAA TTTTTTCAGA ACGTTTAAGT GATATCGGAC 
ATCAAGTTGT TTTGATGTCA ATGGATGAAT ATGATACGAC AAACATCGCG CAGTTAGAAG 
ATTTATTTAT TATTACGTCT ACTCATGGTG AAGGAGAACC GCCTGATAAT GCATGGGATT 
TCTTTGAATT TTTAGAAGAC GATAACGCAC CTAATTTAAA TCATGTGAGA TATTCAGTAC 
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TACTAGAAAA TCTAGGCGCT GAGCGTATAT GTAAGCGTGT AGATTGTGAT ATTGATTATG 360




AAGAAGACGC AGAAAAGTGG ATGGCAGACA TCATTAATAT TATTGATACC ACATCAGAAG 420


GTATTCAAAG TGAATCGGTG ATAAGTGAAT CAATTAAGTC TGCCAAAGAA AAGAAATATT 480




CTAAATCAAA TCCATACCAA GCAGAAGTAT TAGCGAATAT CAATTTAAAT GGTACCGATT 540


CAAATAAAGA AACACGACAT ATAGAATTTT TACTTGATGA TTTTAGTGAA TCATATGAAC 600


CAGGAGATTG TATAGTAGCA TTACCGCAAA ACGACCCTGA ATTGGTTGAA AAACTAATAT 660




CCATGTTAGG TTGGGATCCG CAATCTCCGG TGCCAATTAA TGATCATGGT GATACAGTTC 720


CTATTGTTGA AGCACTAACA TCACATTTTG AATTTACTAA ATTAACATTG CCATTATTGA 780




AAAATGCAGA TATCTATTTT GACAATGAAG AATTATCTGA ACGTATTCAA GATGAGTCAT 840




GGGCGCGTGA ATATGTTATA AATCGGGACT TTATAGATTT AATAACAGAT TTTCCAACTA 900


TAGAATTACA ACCTGAGAAT ATGTATCAAA TCCTTAGAAA ATTACCACCA AGAGAGTATT 960






CGATTTCTAG TAGTTTTATG GCAACGCcAG ATGAAGTGCA TATTACCGTT GGTACGGTTC 1020




GTTATCAAGC ACATGGACGT GAGAGAAAAG GTGTATGCTC GGTTCA'ITIT GCTGAGCGAA 1080


TTAAACCAGG CGATATAGTA CCAATTTATT TGAAGAAAAA TCCGAACTTC AAATTTCCGA 1140






TGAAGCAAGA TATACCGGTT ATTATGATTG GACCAGGTAC TGrAATTGCT CCTTTTAGAG 1200


CATATTTACA AGAACGTGAA GAACTTGGTA TGACTGGAAA AACATGGTTG TTCTTTGGTG 1260


ATCAACACCG TAGTTCTGAC TTTTTATATG AAGAAGAAAT AGAAGAATGG CTTGAAAATG 1320




GAAACTTAAC ACGCGTAGAT TTAGCATTTT CAAGAGACCA AGAACACAAA GAATATGTAC 1380


AGCATCGTAT AATGGAAGAA AGTAAACGTT TCAATGAATG GATTGAGCAA GGCGCACAAT 1440




CTATATTTGT GGCGATGAAA AATGTATGGC GAAAGATGTC CATCAAGCCA TTAAAGATGT 1500






ATTGSTAAAA GAACGTCATA TTTCTCAAGA AGAAGCAGAG TTATTATTGC GACAAATGAA 1560




ACAACAACAA CGCTATCAAC GTGATGTTTA TTAGCGATTG GTGTTAAATA TTTTAAGGTG 1620




TAATGATGTA AAAAGATATA AAGGATGTTG CTCAACATGA ATATGCCATT AATGATAGAT 1680




TTAACAAATA AAAATGTCGT CATAGTTGGT GGAGGCGTCG TTGCAAGTCG TCGGGCACAA 1740


ACATTAAATC AATACGTTGA ACATATGACG GTCATCAGTC CGACAATCAC TGAAAAACTT 1800




CAAAATATGG TAGATAACGG TGTCGTCATA TGGAAAGAAA AAGAATTTGA ACCAAGCGAT 1860


ATTGTAGACG CGTATCTAGT TATTGCAGCA ACCAATGAGC CACGTGTCAA TGAAGCGGTA 1920


AAAAAAGCCT TACCTGAGCA TGCCCTTTTT AATAATGTTG GAGATGCATC AAATGGCAAT 1980




GTTGTATTTC CAAGTGCACT ACACCGCGAC AAGCTAACTA TCAGTGTATC AACTGATGGT 2040
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10 




TACAGTTCGT ATATCGACTT TTTATATACT TGCCGACAGA AAATAAAAGT ACTTGATATA 
ACATATAACG AAAAGCAACA GTTACTGTCA CAAATTGTGT CACAAGAATA TTTAAATCAT 
GACAAACAAG CTCAATTTTT AGCGTGGTTG GATGTAAGAT AATAATAGCG GACCGTCTAA 
CCGTCTAAGG TAAGTCTTCT TATTTTAACT TTAACGCTTA ATCATTGAAA TTAAGACATG 
GGCGGCTTTG TGAATAGTCT AATAATGAAG GATTTAAGCG ATAATGATAT GCGTTTTAAA 
TATGAATATT ACAATAGAGA AAAAGATACG TAGAACAAAC TTAATAAAAT AGGTGGATAA 
ATTGAAATCT GGTTGAAGTC GTTACTATCA TAGCGACCTT TAGCCAGATT TTTTGTGCAA 
TAGAAAGCAA TAATAAAAAT GATAGATCAA AATGAAATAC AGGACAGGAT ATACAAGGAT 
TAGTCATGCC ATGTTATCAA GTAGGAAAAT CAAACTTCAC TATTGATAGT TACGCAAAAA 
AGATTTTTTT GATAAAATGA GATAACTTAA ATATAAAAAA TTATATTAAT TATAATATTT 
AAGTTAAAGA GGGGGATTAT GTAAATTGTA TTAAAAGTGG AGGGAGAAAA TAATATGAAT

AGTGATAATA TGTGGTTAAC AGTAATGGGG CTCATTATTA TTATTTCAAT TGTAGGTTTA 
CTCATTGCCA AAAAGATAAA TCCAGTTGTA GGTATGACAA TCATACCTTG CTTAGGGGCA 
ATGATTTTAG GATATAGTGT GACAGATTTG GTTGGATTTT TTGCTAAAGG GTTAGATCAA 
GTCATCAACG TTGTTATTAT GTTTATCTTT GCCATTATTT TCTTTGGCAT CATGAACGAT 
AGTGGTTTAT TCAAGCCGCT TGTCAAACGC TTAATATTAA TGACACGAGG CAATGTCGTC 
ATTGTCTGTG CAATGACAGC TTTAATTGGC ACAATAGCCC AATTAGATGG GGCCGGTGCG 
GTAACATTTT TGCTTTCTAT TCCTGCATTA TTACCTTTAT ATAAAGCGTT AAATATGAAT 
AAATATTTAT TGATTTTACT ATTAGCATTA AGCGCGGCGA TTATGAACAT GGTACCTTGG
GGAGGTCCAA TGGCTCGTGT AGCTGCAGTG TTAAAAGCCA AAAGTGTCAA TGAATTATGG 
TATGCATTAA TACCTATTCA AATAATAGGT TTGATTCTTG TTATGTTGTT TGCGGTATAT 
CTTGGATTTA AAGAACAGAA ACGTATCAAA AAAGCAATAG AGAGAAATGA ATTACCGCAA
ACACAAGATA TAGATGTACA TAAATTAGTT GAAGTATATG AACGAGATCA AGATGTAAGG 
TTTCCTGTAA AAGGACGTGC AAGAACAAAA TCATGGATAA AATGGGTGAA TACAGCTTTA 
ACTTTAGCTG TTATTCTATC GATGTTAATA AATATTGCGC CACCTGAATT TGCATTCATG 
ATAGGTGTTy CGTTGGCACT TGTTATTAAT TTTAAATCAG TGGATGAACA AATGGAACGA 
TTAAGAGCgC ATGCGCCGAA TGCATTAATG ATGGCTGCAG TGATTATTGC AGCAGGTATG 
TTTTTAGGTG TACTAAATGA AACCGGTATG CTTAAAGCGA TTGCGACCAA TTTAATCAAA 
GTGATTCCTG CAGAAGTAGG ACCATACTTG CATATTATTG TAGGTTTACT TGGCGTACCA 
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2160 

2220 

2280 

2340 

2400 

2460 

2520 

2580 

2640 

2700 

2760 

2820 

2880 

2940 

3000 

3060 

3120 

3180

3240 

3300 

3360 

3420 

3480 

3540 

3600 

3660 

3720 

3780 

3840 
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ACAGCAGGGC AATTTGGTGT ACCGTCTGTA TCAACAGCTT ATTCAATGGT CATAGGGAAT 3960

ATTATAGGTA CATTTGTCAG CCCATTTTCA CCAGCCTTAT GGTTGGCAAT TGGTTTAGCA 4020

GAGGCAAACA TGGGCACGTA TATTAAGTAT GCATTCTTTT GGATTTGGGG ATTCGCTATC 4080

GTTATGTTAG TAATTGCAAT GTTGATGGGC ATTGTGACGA TTTAAGTATG AAAAAATAGA 4140

AACTATGGTC ACGTTGCAAA ATGAAATAAT AGTTGCATAA ACATGTCGAA ATGACGGACG 4200

AATCTTTAAA CAATTTTAAA AATTAATGAA ATAATTGTGT AGAAATATGA ATTTCACTAA 4260

ATGTTAATAA CTTTGTGACG TTTTAGTTAA CAGACTAATA AAAATTTGAA AATACTATAT 4320

ATAGTGGTAT AACGTAATGA GTAGACACAA TATATAGGAA GAAGGGGTAA AATGAATCAA 4380

ATCGAAGAAG CATTAACGGG TTTGATTTCT AAAGATCCTG CTATTGTTAA CGAAAATGCT 4440

AACAAAGATA GTGATACATT TTCAACAATG AGAGATTTAA CAGCAGGTAT CGTTTCTAAA 4500

TCTTACGCAT TAAATCATTT ATTACCAAAG CACGTTGCAG ATGCACATCA AAGAGGGGAC 4560

ATACATTTTC ACGACTTAGA TTATCATCCA TTCCAACCGT TAACTAACTG TTGTTTAATA 4620

GATGCTAAAA ATATGCTACA TAATGGATTT GAAATAGGCA ACGCGAATGT AACTTCACCA 4680

AAATCAATAC AAACTGCATC AGCGCAGCTT GTACAAATTA TAGCCAATGT TTCTAGCAGT 4740

CAATATGGTG GCTGTAcGGT TGACCgCGTT GACGAATTAC tTAGTACATA TGCACGACcA 4800

TAATGAAGAA CAACATAGGA ATATsCGCAA AGCAATTTGT CAAAGAATCT GAAATTGATC 4860

GTTATGTTGA TCAACAAGTC ACTAAAGACA TCAATGATGC GATTGAAAGT TTAGAATATG 4920

AAATTAATAC CTTATATACA TCTAATGGAC AGACACCTTT TGTAACATTA GGATTCGGCT 4980

TAGGTACAGA TCATTTAAGT CGCAAAATTC AACAAGCTAT CTTAAATACT CGTATCAAAG 5040

GCTTAGGAAA AGACCGCACG ACAGCGATTT TCCCGAAACT TGTATTTTCA ATTAAAAAAG 5100

GAACQVACTT TAGTCCGCAA GATCCGAACT ATGACATTAA ACAACTAGCA TTAAAGTGTT 5160

CAACGAAACG TATGTATCCA GATATTTTAA ATTATGACAA ACTCGTAGAA ATATTAGGTG 5220

ATTTCAAAGC GCCAATGGGT TGTCGTTCAT TTTTACCAAG TTGGAAAGAT GCGGAAGGTC 5280

ATTTTGAAAA TAATGGTCGT TGTAATCTTG GTGTTGTTAC ACTTAATTTA CCTAGAATGG 5340

CATTAGAATC TGCCGGTAAT ATGACGAAAT TCTGGGAAAT CTTTTATGAA CGTATCGATG 5400

TGTTACATGA TGCATTACTT TATCGTATAA ATCGTTTGAA AGATGCTGTA CCGAATAACG 5460

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CACCGATTTT ATATAAAAGT GGCGCATTTA ACTATAAATT AAAAGAAACA GATGATGTTG 5520

CTGAGTTATT TAAAAATAAA CGTGCAACGA TTTCAATGGG CTATATAGGG TTGTATGAAA 5580

CAGCTACTGT TTTCTATGGT CCAGACTGGG AAACATCTCA AGAAGCAAAA GCATTTACGC 5640
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GGTTCAGTAT TTmCAGTACG CCGAGTGAAT CGCTAcGGAT CGTTTTTGTC GTTTAGACCA 5760 

AGAGAGATTT GGAGATATTA AAGACATTAC AGATAAAGGA TATTATCAAA ACTCTTTCCA 5820 

TTATGATGTA CGTAAAGATG TTACACCTTT TGAAAAGTTA GATTTTGAAA AAGATTATCC 5880

TTATTATGCG AGTGGTGGTT TCATTCACTA TTGTGAGTAT CCGAAATTGC AACACAATTT 5940

GAAAGCACTA GAAGCGGTAT GGGACTACTC TTATGACAAA GTTGGTTACT TAGGTACAAA 6000 

TATTCCGATT GATCATTGTT ATGAATGTGA TTACGATGGA GATTTTGAAG CAACTGAAAA 6060 

AGGATTTAAA TGCCCGAACT GTGGCAATGA TAATCCTAAA ACAGTTGATG TCGTTAAACG 6120 

AACATGTGGT TACCTAGGCA ATCCAGTTCA ACGTCCAGTA ATTAAAGGCC GTCATAAAGA 6180

AATTTGCGCA CGAGTAAAAC ATATGAAAGC GCCTAAAGAA TGATACTTTT AGACATTAAA 6240

CAAGGACAAG GTTATATTGC TAAAATAGAA TCAAATAGCT TTGTTGACGG TGAAGGAGTA 6300 

AGATGCAGTG TTTATGTATC AGGATGTCCA TTTAATTGTG TTGGATGTTA TAACAAAGCC 6360

TCACAAAAGT TCAGATATGG CGAGAAATAC ACTGATGAAA TATTAGCAGA AATATTAGAT 6420 

GATTGCGATC ATGATTATAT ATCTGGGCTA AGTCTATTAG GTGGCGAACC ATTTTGTAAT 6480 

TTGGATATTA CATTAAATCT TGTCAAAGCA TTTCGAGCAC GTTTTGGAAA TACAAAGACA 6540

ATTTGGGTAT GGACTGGATT TTTATATGAA TATTTAGCAA ATGATTGTAC AGAACGTCGA 6600 

GAGTTATTAT CATACATTGA CGTTTTAGTA GATGGTCTAT TTATACAACA CTTATTCAAA 6660

CCTGATTTAC CATATAAAGG TTCTTTAAAT CAACGCATTA TAGATGTACA ACAATCACTC 6720 

TCGCATGCGC GTATGATTGA ATATATAGTT AGTTGAATAT GTATTAGAAG TCAAGGTAAC 6780

ATTCGTTGCC TTGGCTTCTT TTTAGGTTAG GTACATAATT GAAAGTTAAT AAAAGCAATT 6840

CTTTATAAAA ATATATTGAT AGAATATGAC CTAACAATCA TTTTGATACC AATACTAAAA 6900 

GTTGCAXATC CGTTTTTTAA AAAAGTTGAA AGAGAAAAGT GGTATTTTAG TGGGAAGGAA 6960

GTCEAACTTT TTGGTAGCGT TTTACAATAA ATAAATATTC GTTAATAACG TATAAATATT 7020

CTTAAATGCC ATTCTAGTAA AATTTGTTAA ATTCGTTAAA TCGTAACTTA ACACTGTTAT 7080 

TTTAGCGCTA TTAAGGTTTT GTTTATTACG GGAAAAATTA TATAAATATT CAATAATTGC 7140

CAAGTTTCAA ATTGTATGAA ATTTGCATTA TTATTAAATG TTAGTTATTG TCAATTTTGT 7200

GAATCAATAT AATTATTACA TTTTGAGATA AATCGAAACA GGATTCATAA AATTAATAAT 7260 

TAGGGGGAGC ACAATTGAAA AAAGAGAAAG TTATGGACTG GACGACCTTT ATAGGGACAG 7320 

TAGCTGTACT TCTTTTTGCA GTTATACCTA TGATGGCTTT TCCAAAAGCA AGTGAAGATA 7380 

TCATCACTGG TATTAATAGT GCCATTTCTG ATTCAATTGG TTCGATATAT TTATTTATGG 7440 
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TTGGTAAAGC AAGTGATAAA CCAGAATTTA ATACATTTAC ATGGGCGGCA ATGCTGTTTT 7560 

GTGCAGGCAT AGGCTCTGAT ATTTTATACT GGGGCGTTAT TGAATGGGCT TTTTACTATC 7620 

AAGTTCCACC AAATGGCGCG AAAAGTATGA GTGATGAAGC ACTCCAATAT GCGACGCAAT 7680

ATGGTATGTT CCACTGGGGG CCAATTGCTT GGGCTATTTA TGTTCTACCA GCATTACCAA 7740 

TTGGTTATTT AGTATTTGTT AAAAAACAAC CGGTGTATAA AATTAGTCAA GCTTGTCGTC 7800 

CGATTTTAAA AGGTCAAACA GATAAATTTG TAGGTAAAGT TGTAGATATC TTATTTATCT 7860 

TTGGATTGCT AGGTGGTGCG GCAACATCAC TAGCGTTAGG TGTGCCATTA ATTTCTGCAG 7920 

GCATAGAAAG ATTAACTGGT TTAGATGGTA AAAATATGAT TTTACGTTCG GCCATTTTAT 7980

TAACAATCAC GGTTATATTT GCCATTAGTT CATATACAGG ATTGAAAAAA GGTATTCAAA 8040 

AGTTAAGTGA TATCAACGTT TGGCTATCCT TTGTACTTTT AGCCTTTATA TTTATTATTG 8100 

GACCGACTGT TTTTATTATG GAAACGACAG TGACAGGGTT CGGAAATATG TTGAGAGATT 8160

TCTTTCATAT GGCAACATGG TTAGAACCAT TCGGTGGTAT TAAAGGTCGA AAAGAAACGA 8220 

ATTTCCCACA AGACTGGACA ATATTCTACT GGTCATGGTG GTTAGTATAT GCGCCATTTA 8280 

TCGGTTTATT TATCGCTAGA ATTTCAAAAG GTCGACGCCT TAAAGAAGTC GTGCTAGGAA 8340 

CAATTATTTA TGGAACGCTT GGATGCGTAT TATTCTTTGG TATTTTTGGT AACTATGCTG 8400

TGTATTTACA AATTTCTGGA CAGTTTAATG TAACACAATA TTTAAATACA CATGGTACAG 8460 

AGGCAACCAT TATTGAAGTG GTGCATCATT TACCATTCCC ATCATTCATG ATTGTACTAT 8520

TCTTAGTATC TGCTTTCTTA TTCTTAGCAA CAACATTTGA TTCGGGTTCA TATATTTTAG 8580

CGGCAGCATC TCAGAAAAAA GTGGTAGGCG AACCATTACG TGCCAATCGT TTATTCTGGG 8640

CATTTGCATT GTGCTTATTG CCATTTTCAT TGATGCTAGT TGGTGGTGAA CGTGCATTAG 8700 

AAGTATTGAA AACTGCTTCA ATACTGGCAA GTGTGCCATT AATTGTTATT TTTATTTTCA 8760 

TGATGATATC ATT7TTAATC ATTTTAGGGC GCGATAGAAT TAAACTTGAA ACGCGTGCTG 8820

AAAAATTAAA AGAAGTTGAA CGTCGTTCAT TGCGAATCGT TCAAGTATCa GAAGAAGAAC 8880 

AAGACGATAA TTTATAATTC AAAGCGGGTC TGGGACGACG AAATGaATTT TGTGAAAATA 8940

TCATTTCTGT TCCaTTCCCC TTTTTTTAGT AGCATTGTAG GATGAACTTT TAGGTTTTCA 9000

TTAATGTTGT ACTAAAAGAT TTAATTTTTT AGTGCTCCAA GTACTTATTT ATTGTATGAA 9060 

GCATATTCTA AATCGAAGTT TGAAAGACTC TCATTGATTA TTAAATTAAA TAAAGGGTAT 9120 

GCGTATGTAC AATTCAAATT AATCGAAGGA TGAAATAAAA TGACTAATCA ATTTAAAAAT 9180 

AAACAGTCCA AATTACATGA CAGTTTAGAA TCCATCACAA AAAACTTATA TGCGACACCT 9240
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ACAGAATATT GTTATCTATC ATTCCGGACA CTTAGGTGAC TCCCAACAAG ACATTGCATC 9360 

ATTAGGTGGT GTTTCAAAAG TATTGATGAA TCATGATCAT GAATCTATAG GAGGTTCTAA 9420 

TCAAGTTGAA GCCCCTTACT TTATACATGA AAATGATGTG GCTGCACTGA AACATAAGAT 9480 

TTCTGTTCAA AAACAATTTA GTAATCGTGT AATGTTGGAT AAGGATTTAG AAGTTATTCC 9540 

CGCGCCTGGA CATACACCAG GGACGACACT ATTTTTATGG GATGATGGTC ATCACCGTTA 9600 

CTTATTTACT GGAGATTTTA TATGTTTTGA AGGGAAGAGA TGGCGTACAG TTATATTAGG 9660 

TTCAAGTGAT AGAGAAAAAT CTATTCAAAG TTTAGAGATG GTTAAAGAAT TAGATTTTGA 9720 

TGTACTTGTA CCTTGGGTTA CTATCAAAGA TGAACCGTTA GTTTATTTTG TAGAAAATGA 9780 

ATATGAAAAA CGTGAACAAA TACAAAATAT TATTGATAGA GTACGTGAGG GCGAGAATAG 9840 

CTAATTGAAA TATATTGGCG AAgCAATGTA ACGAATCTAA GAAAGCCCTA GAAAATACCT 9900 

CCATAATTGA TTGTCATATA AAACAAAAAC GGTAATTTCT ATTTATTGAG ATAGAAATTA 9960 

CCGTTTATTT CGTGGACCTA TTGCATTGTT TTTATCATGC ATAATCATCA TTGTCGTTGT 10020

TTGAGTCAAT TTTAATTTTC AGAATCAGAA GGCTGTTCTG GAATTGGGAA ATATTTGAAA 10080

ATTTCACCGC TTTCAATCGC TTCGGTTAAC TGTTCTAACC ATTCGTAATA AACATGTGTA 10140 

TGATCAAGCT GAGCTTTAAT TTTTTGTGCC TCTTGTGTTT CAGCTTCAGT TAAATCACTG 10200 

CTTTCAAGTA ATGGATTGAT AATAGCTTGA GCATCTTTTA CTGCTTCGAC ATTGATGTCA 10260 

ATTTCACGCT GGAATTTTTT AGTGAAAAAG TTTCGGAAAA AGATGAAAAA GTCTTTCTCG 10320 

GCGATAAAAT GTTGTTTGCG GCTTCCTCTC GTAAATTGTT GTTTAACAAT ATCAAATTCC 10380

TGCAATTTCT TAACGCCAGC ACTCATACTT GGTTTGCTCA TTTGCAATTG ATGACGCATT 10440

TCATCAAGCG TCATACTGCC TTCAAACACC ATTGTGCCAT ATAAGTTTCC TACACTTCTA 10500 

TTAGTGCCAT ACAAATCCAT 7GTCTGTCCA ATTGAATTAA TTACAATATC TTTTGCTTGT 10560

TCTAATTGTT GCTGTTTGTT CTGAGAACGA GTCATCATTG CACCTCCGTA CATCATTTTG 10620 

GTCACGTTAA AATAAATACT AATACATTAT AAAACCTTTT CTAAAAAAAG ACATTAAAAA 10680 

TATTTAAAGC ATTAAAGTTA AATGTTTCGT TAAATAAAAA TCTAACGAAC TTACAAAACT 10740

TAATTCTTGA GTTGTTTTGT AAATTGACAC ATTTTTCATT TCTATGCTAA CATAAGTnTG 10800 

TAAAATTcGT TAAATAAAAA TTTAACAAAC TTAACGGrGG TTGTTGAAkG GrACTTTTAA 10860 

aACATTTATC TCAGCGTCAA TATATTGATG GTGAGTGGGT TGAAAGCGCG AATAAAAATA 10920 

CAAGAGATAT TATCAATCCT TACAATCAAG AAGTGATATT TACGGTTTCT GAAGGGACAA 10980 

AAGAGGATGC AGAACGTGCA ATCTTAGCTG CAAGACGTGC GTTTGAGTCT GGTGAATGGT 11040 
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AACATCgCGA AgCgTTAGCA CGATTAGAAA 
CATATGCAGA TATGGATGAT ATTCATAATG 

AAGACGGTGG CGAAATGATT GATTCACCAA
AACCAGTAGG TGTAGTTACA CAAATTACAC
GGAAAATTGC GCCAGCGCTT GCTACGGGTT
CACCATTAAC AACAATACGT GTTTTTGAAT
CAATTAATCT TATTCTAGGT GCAGGTTCTG
AGGTTGACCT TGTATCATTT ACAGGTGGCA
CTGCTAATAA TGTTACGAAT ATTGCCTTGG
TTGATGATGC TGATTTTGAA TTGGCAGTAG
CAGGTCAAGT TTGTTCAGCA GGATCAAGAA
TTGAGCAAGC ACTTATTGAT CGCGTGAAAA
ATACTGAAAT GGGACCAGTG ATTTCAACAG
ATGTAGCTAA AGCAGAAGGC GCAACAATTG
ATTTAAAAGA TGGTCTATTC TTCGAGCCAA
GTATTGTACA AGAAGAGGTT TTCGGACCTG
AAGAAGCGAT TCAATTAGCG AATGATTCTA
AAGATATTGG AAAAGCACAA CGCGTTGCTA
ATGATTTCCA TCCATATTTT GCACAAGCGC
GTAGAGAATT AGGCAAAGAA GGCTTAGAAG
ATACAAATCC ACAATTAGTG AATTGGTTTA
TTGTAAGAAC ACAAGACACT CACTTTGTTT
TTGGACTAAA CGCAAAATGA ATCATAGATT
GGAAAAGCGA GTGTTTTGGT TAGCTAAGTT
AACAAATATT TCATGCAATA CTCACTTTGA
GAGTAACAAA AACAAATCAT ATGATTATGT
ACTAGGTAAT CGTCTGAGTG AAGATAAAGA

CAGTGATTAT TTTTGGGATT TATTTATCCA 
CAATAAATTT TACGATTGGA TTTATTCAAC 
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CATTAGATAC 


TGGAAAAACG 


TTAGAAGAAT 


11160 


TGTTTATGTA 


TTTTGCTGGA 


TTAGCAGATA 


11220 


TTCCAGATAC 


AGAAAGCAAA ATTGTTAAAG 


11280 


CTTGGAATTA 


TCCGTTATTA 


CAAGCATCAT 


11340 


GTTCACTAGT 


TATGAAACCA AGTGAAATTA 


11400 


TAATGGAAGA AGTTGGTTTC 


CCTAAAGGAA 


11460 


AAGTTGGTGA 


CGTAATGTCA GGTCATAAAG 


11520 


TTGAGACTGG 


TAAGCATATT ATGAAAAATG 


11580 


AACTTGGCGG 


TAAAAATCCA 


AACATTATCT 


11640 


ACCAAGCGTT AAATGGTGGA TATTTCCATG 


11700 


TATTAGTACA AAACAGTATT 


AAAGACAAAT 


11760 


AAATCAAATT 


AGGTAATGGT 


TTTGATGCTG 


11820 


AACATCGTAA 


TAAGATCGAA 


TCTTATATGG 


11880 


CTGTTGGTGG 


TAAACGTCCA 


GATAGAGATG 


11940 


CAGTCATTAC 


AAATTGTGAT 


ACGTCAATGC 


12000 


TCGTTACTGT 


AGAAGGCTTT 


GAAACTGAAC 


12060 


TATATGGTTT 


AGCAGGTGCT 


GTATTTTCTA 


12120 


ACAAGTTGAA 


ACTTGGAACG 


GTGTGGATTA 


12180


CATGGGGTGG 


ATACAAACAA 


TCAGGTATCG 


12240 


AGTACCTTGT 


TTCAAAACAC 


ATTTTAACAA 


12300 


GCAAATAAAA 


ATTAGATAAG 


GTGAGTGCCA 


12360 


TGTATAAGTG 


GCGAAATGTT 


GATTGATAAT 


12420 


ATTTCATTAC 


TGTTAGTAAC 


AATCGTAAAA 


12480 


TAGCAATTCA 


ACGATAACCA 


ATCAGCCACT 


12540 


AATACAACAA 


ACTTTGGAGG 


TCATAACGAT 


12600 


CATCATTGGA 


GGAGGCAGTG 


CAGGTTCTGT 


12660 


TAAAGAAGTC 


TTAGTATTAG 


AAGCGGGTCG 


12720 


AATGCCTGCT 


GCGTTAATGT 


TCCCTTCAGG 


12780 


AGATGAAGAA 


CCACATATGG 


GCGGTCGTAA 


12840 
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TCAACGTGGT AATCCAATGG ACTATGAAGG CTGGGCAGAA CCAGAAGGTA TGGAAACTTG 12960 

GGATTTTGCG CACTGTTTAC CGTATTTTAA AAAATTAGAA AAAACATACG GTGCAGCGCC 13020 

TTATGATAAA TTTAGAGGCC ATGATGGACC AATTAAGTTA AAACGAGGGC CAGCAACGAA 13080 

TCCTTTATTC CAGTCATTCT TTGATGCAGG TGTTGAAGCA GGCTATCATA AAACACCTGA 13140

TGTGAATGGA TTTAGACAAG AAGGTTTTGG ACCGTTCGAT AGTCAAGTAC ATCGTGGTCG 13200 

CCGAATGTCA GCTTCAAGAG CATATTTACA TCCAGCGATG AAGCGTAAAA ACTTAACCGT 13260 

TGAAACACGT GCCTTTGTAA CTGAAATTCA TTATGAAGGT AGAAGAGCAA CTGGTGTTAC 13320 

GTATAAGAAA AATGGCAAAC TACATACCAT CGATGCTAAT GAAGTCATTT TGTCTGGTGG 13380 

GGCATTCAAT ACGCCACAAT TACTACAATT ATCTGGTATC GGTGATTCAG AGTTCCTAAA 13440 

ATCAAAAGGC ATTGAGCCAC GTGTTCATTT ACCTGGTGTG GGTGAAAACT TTGAAGATCA 13500 

CTTAGAGG 13508 
(2) INFORMATION FOR SEQ ID NO: 121:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7645 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121: 



GTAAGTATTG 


TCTTGATTTC 


CTAATAAAGT 


TATATCTTGT 


AATTCATCTT 


GTTGACGGCC 


60 


ATGTGCCATA 


TAAAGCGCTC


CTTTAAATTT 


ATTTTTTTAT 


TATTTTGGCG 


TCTCGGCGTG 


120 


CTTTTTCAAA 


CATGTAATAA 


CTTGCACCGA 


TAATAACGAC 


GTAACCTAAT 


GTTGCATAGA 


180 


aatcSggaga 


TTCTCCGAAT 


AGAATAAATC 


CAAGTATTGC 


TGTGAAAATT 


ATAGATGCAT 


240 


ACGTAAAAAT 


AGAAATATCT 


TTTGCTGCTG 


CAAAACTATA 


TGCTAAAGTA ACACCAATTT


300 


GACCCACAGC 


GGCAgCTAAG 


CCAGCCCCTA 


ATAGATAAAG 


TATTTGCATC 


TGACTCATTG 


360 


GTTCATAAGT 


ATATGCAGTG 


AAAGGTATTA 


AAACGATGAC 


AGAAAATAAG 


GAGAAGTAAA 


420 


ATACTATAGT 


ATATGGTGCT 


TyTCTTGTAC 


TAAGTGCTCG 


AACACATGTA 


TATGCTGATG 


480 


CTGCAAAAAT 


ACCTGAGAAT 


AAGCCAGCTA 


ATGATGGAAT 


CATAGATGAT 


GAAAATTCAG 


540 


GTTTCACTAT 


TAAnAGCAaC 


CTAAAATAGC 


AATTATCATT 


GCTGTAATTT 


GaTACTTCCT 


600 


TACCTTTTCA 


TGtAAGAaaA 


CAATGCTTaA 


TAAAATCGTC 


CAGAAAGGAT 


TGAGTTTCAT 


660 


TAATGAATCG 


GCATCACTAA 


GTACCATATG 


ATCAATGGCA 


TAAATATTTA 


ACAATACACC 


720 
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TGGCTGATGG TATTTATATA TAAAAAATAA 
TAATGATTTT TGAAAAACAG GAAGGTCACC 

GAAACCAATA GCCGAAATTA AAATGGCAAT
CGCCTCTTTT ATATAAAATT AACGTATTTA
ATAGTTGAAA TTTACTATAA AAAGACTATA
TATTTGTCGG AATAATAGGG CATTACACTT
AATATCAATT CAGTATCAAG CTAATAAGCT
TTGACACAGA TTTAAAAAAA TCAAGTGATA
AGTTTTTCTA ATTTAGTATT GGTGCCTAGT
ATGGCACTTT AAATCATAGT GTGTCTTATG
AAACGAAAAA gACACAATAT CTTGTGTTTT
ACATTTAAAA GTAATTTAAC ACAGAAATTT
TTAGAAAATG TACTGAGCAA ATGGAAGATA
TTTTATACAT TCAACCCATA TAAGCTACTA
TTACATTTGA GAAAATAAGT AGCTTCATTA
AACCATGTTG TTAAAGCATT TTTTAATTGG
AAGAAGGTGA 7CCAATGAAA ATAATATATT
TTAAGAGAAC AGAACTTGAA AATACGCTTG
TTCATGAACC GTTTATTATC GTTACTGGCA
TTCAATCTTT TTTAGAAGTT AATCATCAAT
GAAATTGGGG ACTAAATTTC GCAAAAGCGG
CTTTATTAAT GAAGTTTGAG TTACATGGAA
AGGTGGGTAA TTTTAATGAA AACCATGGAA
AATGAGGTCA CTAAACGAaG AGAAGATGGA
TTAGTAGCTT ATTTAGAAGA AGTAAAAGAC
CGTTTACGTT ATTTAGTAGA CAACGATTTT
GCGGATCTAA TTGAAATCAC TGATTATGCA
ATGTCAGCTA GTAAATTTTT CAAAGATTAC
TTAGAAGACT ATAATCAACA CGTTGCCATT



TGGAATAAAC 


ATTGCTACTA 


AGTTTCGTGC 


840


TGCAAGTCTG 


AAAAACACTG 


ACATAAAACT 


900 


GATACCTTTT 


ACTTTAGGAT 


TCAATTTTAT 


960 


TATTAGCATA 


AAACAACATG 


TTGTGCATAA 


1020 


ATAGACTGTA 


GCGAACAAAC 


GTTCTGTGTT 


1080 


TTATGAATGT 


TTGTGTTATT 


ACATAAAACA 


1140 


TTTTCTTGAT 


TTCTGTTGAT 


ACAATTGAGA 


1200 


TCTACTAAAA 


AATTTTTTTA 


AATTTGTTCA 


1260


TGGAACGTTT 


TACGAACATT 


CGATTAGAAA 


1320 


TATAATGAAA 


CACATAATAT 


AGTGTTGGTG 


1380 


GTATGCAAAT 


GCTTTATTTA 


TGAAGAAATT 


1440 


AATAGTTATT 


ATCAATTAAT 


AGTCATATTT 


1500 


TCCAATGATG 


TAAACACTAC 


ATATAGTGAT 


1560 


TTTTCTCAAA 


TATAAATCTA 


TGCAATTGGT 


1620 


TAGTTAATAC 


AATGCTGAGA 


TAACCATAGT 


1680 


AATGACTACT 


TTATTTAAAA 


GGGTTGAAGA 


1740 


TTTCATTTAC 


TGGAAATGTC 


CGTCGTTTTA 


1800 


AGATTACAGC 


AGAAAATTGT 


ATGGAACCAG 


1860 


CTATTGGATT 


TGGAGAAGTA 


CCAGAACCCG 


1920 


ACATCAGAGG 


TGTGGCAGCT 


AGCGGTAATC 


1980 


GTCGCACGAT 


ATCAGAAGAG 


TATAATGTCC 


2040 


AAAACAAAGA 


CGTTATTGAA 


TiTAAGAACA 


2100 


GAGAAAAAGT 


ACAATCATAT 


TGAATTAAAT 


2160 


TTCTTTAGTT 


TAGAAAAAGA 


CCAAGAAGCT 


2220 


AAAACAATCT 


TCTTCGACAC 


TGAAATCGAG 


2280 


TATTTCAATG 


TGTTTGATAT 


TTATAGTGAA 


2340 


AAATCAATCC 


CGTTTAATTT 


TGCAAGTTAT 


2400 


GCTTTGAAAA 


CAAATGATAA 


AAGTCAATAC 


2460 


GTTGCTTTAT 


ACCTAGCAAA 


TGGTAATAAA 


2520 
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10 






ACATTTTTAA 


ACGCAGGCCG 


TGCGCGTCGT 


GGTGAGCTAG 


TGTCATGTTT 


CTTATTAGAA 


2640 


GTGGATGACA 


GCTTAAATTC AATTAACTTT ATTGATTCAA CTGCAAAACA ATTAAGTAAA 


2700 


ATTGGGGGCG 


GCGTTGCAAT 


TAACTTATCT 


AAATTGCGTG 


CACGTGGTGA AGCAATTAAA 


2760 


GGAATTAAAG 


GCGTAgCGAA AGGCGTTTTA CCTATTGCTA AGTCACTTGA AGGTGGCTTT 


2820 


AGCTATGCAG 


ATCAACTTGG 


TCAACGCCCT 


GGTGCTGGTG 


CTGTGTACTT 


AAATATCTTC 


2880 


CATTATGATG 


TAGAAGAATT 


TTTAGATACT 


AAAAAAGTAA 


ATGCGGATGA 


AGATTTACGT 


2940 


TTATCTACAA 


TATCAACTGG 


TTTAATTGTT 


CCATCTAAAT 


TCTTCGATTT 


AGCTAAAGAA 


3000 


GGTAAGGACT 


TTTATATGTT 


TGCACCTCAT 


ACAGTTAAAG 


AAGAATATGG 


TGTGACATTA 


3060


GACGATATCG 


ATTTAGAAAA 


ATATTATGAT 


GACATGGTTG 


CAAACCCAAA 


TGTTGAGAAA 


3120 


AAGAAAAAGA 


ATGCGCGTGA 


AATGTTGAAT 


TTAATTGCGC 


AAACACAATT 


ACAATCAGGT 


3180 


TATCCATATT 


TAATGTTTAA 


AGATAATGCT 


AACAGAGTGC 


ATCCGAATTC 


AAACATTGGA 


3240 


CAAATTAAAA 


TGAGTAACTT 


ATGTACGGAA 


ATTTTCCAAC 


7ACAAGAAAC 


TTCAATTATT 


3300 


AATGACTATG 


GTATTGAAGA 


CGAAATTAAA 


CGTGATATTT 


CTTGTAACTT 


GGGCTCATTA 


3360 


AATATTGTTA 


ATGTAATGGA 


AAGCGGAAAA 


TTCAGAGATT 


CAGTTCACTC 


TGGTATGGAC 


3420 


GCATTAACTG 


TTGTGAGTGA 


TGTAGCAAAT ATTCAAAATG 


CACCAGGAGT 


TAGAAAAGCT 


3480 


AACAGTGAAT 


TACATTCAGT 


7GGTCTTGGT 


GTGATGAATT 


TACACGGTTA 


CCTAGCAAAA 


3540 


AATAAAATTG 


GTTATGAGTC 


AGAAGAAGCA 


AAAGATTTTG 


CAAATATCTT 


CTTTATGATG 


3600 


ATGAATTTCT 


ACTCAATCGA 


ACGTTCAATG 


GAAATCGCTA 


AAGAGCGTGG


TATCAAATAT 


3660


CAAGACTTTG 


AAAAGTCTGA 


TTATGCTAA7 


GGCAAATATT 


TCGAGTTCTA 


TACAACTCAA 


3720


GAATTTGAAC 


CTCAATTCGA 


AAAAGTACGT 


GAATTATTCG 


ATGGTATGGC 


7ATTCCTACT 


3780 


TCTGAGGATT 


GGAAGAAACT 


ACAACAAGAT 


GTTGAACAAT 


ATGGTTTATA 


TCATGCATAT 


3840 


AGAtTAGCAA 


TTGCTCCAAC 


ACAAAGTATT 


TCTTATGTTC 


AAAATGCAAC 


AAGTTCTGTA 


3900 


ATGCCAATCG 


TTGACCAAAT 


TGAACGTCGT ACTTATGGTA 


ATGCGGAAAC 


ATTTTACCCT 


3960 


ATGCCATTCT 


TATCACCACA 


AACAATGTGG 


TACTACAAAT 


CAGCATTCAA 


TACTGATCAG 


4020 


Ik TCI 2X 21 A TT ZV & 


TCGATTTAAT 


TGCGACAATT 


CAAACGCATA 


TTGACCAAGG 


TATCTCAACG 


4080 


ATCCTTTATG 


TTAATTCTGA 


AATTTCTACA 


CGTGAGTTAG 


CAAGATTATA 


TGTATATGCG 


4140 


CACTATAAAG 


GATTAAAATC 


ACTTTACTAT 


ACTAGAAATA 


AATTATTAAG 


TGTAGAAGAA 


4200 


TGTACAAGTT 


GTTCTATCTA 


ACAATTAAAT 


GTTGAAAATG 


ACAAACAGCT 


AATCATCTGG 


4260 


TCTGAATTAG 


CAGATGATTA 


GACTGCTATG 


TCTGTATTTG 


TCAATTATTG 


AGTAACATTA 


4320 
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ATGTTTTGGA GACAAAATAT ATCTCAAATG 
GACATTGCAA GTTGGAAGAC TTTATCTGAA 
GCTGGTTTAA CAGGCTTAGA TACACATCAA
CATACGACTG ACTTAAGGAA AAAAGCAGTT
CACGCGAAAA GCTATTCACA TATTTTCACA
CTATTAGATG AATGGGTTTT AGAGGAACCC
GCTAATTATC ACAAACTTTG GGGTAAAGAA
GTTACGAGTG TATTTTTAGA AACATTCTTA
CTTGCTGGTC AAGGGAAAAT GACGACATCA
GAATCTATTC ATGGTGTATT TACCGGTTTA
GAAAGTGAGA AACAAAAAGC AGATCAAGAA
AATGAAGAGT CATACACAAA AATGTTATAC
AACTATGTTA AATATAATGG AAACAAAGCA
GAGGAACGTG AATTTAACCC AATCATTGAG
GACTTCTTCT CAGTAAAAGG TGATGGTTAT
GATGATGACT TTGTATTTGA CAACAAATAA
GGGAAATAGC GATTCGTTTC G7CTTGTCTC
TCTAACTCAT TATGAGTCTG AGTAAGAAAT
ATTGGCAGTA GTTGGCGGGG CCCCAACACA
GTGCAAGTTG GCGGGGCCCC AACATAGAAG
AAGTf GGCGG GGCCCCAACA TAAAAGCAGG
TCGGgCGGGG CCCCAACATA AAGAAAAACT
AGTTTTACTC ATGTATTCCT ATTTTTAAGT
ACTACTTAAT CAATCATTAG TAGTTTTTAT
TAAGTGTTCT ATTTTACTTT AAGTAAACAA
TTAATTGCAA ATATCAATAA AATTGACACT
TCAAAAAACG CTACTATTAA TGAGAAATAT
AGGGAGAAAA ATTTGTAATG AAGTATTTAT
TATTGTTGAC AATTATTTCG TTGTTCATAG



TGGGTTGAAA 


CAGAATTTAA AGTATCAAAA 


4440 


GCTGAACAAG ACACATTTAA AAAAGCATTA 


4500 


GCAGATGATG 


GCATGCCTTT 


AGTTATGCTA 


4560 


TATTCATTTA 


TGGCGATGAT 


GGAGCAAATA


4620 


ACACTATTAC 


CATCTAGTGA 


AaCAAACTAC 


4680 


CATTTAAAAT 


ATAAATCTGA 


TAAAATTGTT 


4740 


GCTTCGATAT ACGACCAATA TATGGCCAGA 


4800 


TTCTTCTCAG 


GTTTCTATTA 


TCCACTATAT 


4860 


GGTGAAATCA 


TTCGTAAAAT 


TCTTTTAGAT 


4920 


GATGCACAGC 


ATTTACGAAA 


TGAACTATCT 


4980 


ATGTATAAAT 


TGCTAAATGA 


CTTGTATTTA 


5040 


GATGATCTTG 


GAATCACTGA 


AGATGTGCTA 


5100 


CTTTCAAACT 


TAGGCTTTGa 


ACCTTATTTT 


5160 


AATGCCTTAG 


ATACAACAAC 


TAAAAACCAT 


5220 


GTATTAGCAT 


TAAACGTAGA 


AGCATTACAA 


5280 


CAATTAAATT 


AAAAGACCTT 


CACATGTAAA 


5340 


CTACATGTTG 


AAGGTCTTTT 


TTTATGTGTA 


5400 


CAATGCTCTA 


AGATGTACAA 


TGCTATTTAT 


5460 


GAAGCAGGCG 


GAAAGTCAGC 


TAACAATATT 


5520 


CAGGCGGAAA 


GTCAGCTAAC 


AATAATGTGC 


5580 


CGGAAAGTCA 


GCTAACAATA 


TTGTGCAAGT 


5640 


TTTTCCTTTA 


GAAATTATCA 


CTTCCaCaTG 


5700 


ACACATTAGC 


TGAGGCTAAT 


GTTAAGAACC 


5760 


CATTTCCACT 


ATTCCCaGAC 


ATCaAAATCT 


5820


AATACACATT 


CCGAAAAATT 


AAATTTCAGT 


5880 


AAATTATTTG 


AAAGGCTATT 


GAAATTATGG 


5940 


TATCAATGAT 


AATGATTATC 


ATTAATTTAA 


6000 


TAAAGGGAAA 


TATTTTGCTT 


CTATTACTAA 


6060 


GTGTGAGTGA 


ACTATCAATT 


AAAGATTTAC 


6120 
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GTATTTTAAT TGCTGGAAGT TCGTTGGCTT TAGCAGGCTT GATAATGCAA CAAATGATGC 624 0 

AAAATAAGTT TGTTAGTCCG ACTACAGCTG GAACGATGGA ATGGGCTAAA CTAGGTATTT 6300 

TAATTGCTTT ATTGTTCTTT CCAACCGGTC ATATTTTATT AAAACTAGTA TTTGCTGTTA 6360 

TTTGCAGTAT TTGCGGTACG TTTTTATTTG TTAAAATCAT TGATTTTATA AAAGTGAAAG 6420 

ATGTCATTTT TGTACCGCTT TTAGGAATTA TGATGGGTGG GATTGTTGCA AGTTcACAAC 6480 

CTTCATCTCA TTGCGCACGA ATGCTGTTCA AAGCATTGGT AACTGGCTTA ACGGGAACTT 6540 

TGCCATTATC ACAAGTGGAC GCTATGAAAT TTTATATTTA AGTATTCCTC TTTTAGCATT 6600

GACATATCTT TTTGCTAATC ATTTCACGAT TGTAGGAATG GGTAAAGACT TTACTAATAA 6660

TTTAGGTTTG AGTTACGAAA AATTAATTAA CATCGCATTG TTTATTACTG CAACTATTAC 6720 

AGCATTGGTA GTGGTGACTG TTGGAACATT ACCGTTCTTA GGACTAGTAA TACCAAATAT 6780 

TATTTCAATT TATCGAGGTG ATCATTTGAA AAATGCTATC CCTCATACGA TGATGTTAGG 6840

TGCCATCTTT GTATTATTTT CTGATATAGT TGGCAGAATT GTTGTTTATC CATATGAAAT 6900

AAATATTGGT TTAACAATAG GTGTATTTGG AACAATCATT TTCCTTATCT TGCTTATGAA 6960

AGGTAGGAAA AATTATGCGC aACAATAATA AAAAAATAAT GCTTTTAATT GCAGTAACGT 7020

TATTAATTAG TATGCTGTAC TTATTTGTAG GTATTGATTT TGAAATATTT GAATATCAAT 7080

TTTCAAGTCG TTTAAGAAAG TTCATATTAA TTATTTTAGT AGGTGCTGCC ATTGCAACTT 7140

CAGTGGTGAT TTTTCAAGCG ATTACAAATA ACCGTCTATT GACACCATCA ATAATGGGGT 7200 

TAGATGCAGT TTATTTATTT ATCAAAGTAT TGCCAGTCTT TTTATTTGGA ATTCAATCGG 7260 

TATGGGTTAC TAATGTATAT TTGAACTTTA TATTAACACT TATAACGATG G7GTTATTCG 7320

CACTAATCCT ATTCCAAGGT ATCTTTAAAA TCGGACATTT TTCAATTTAT TTTATCTTAC 7380

TTAJTGGTGT CCTTTTAGGA ACATTTTTTA GAAGCATAAC AGGTTTTATT CAACTGATTA 7440

TGGATCCTGA GTCATTTTTA GCAATACAAA GTAGTATGTT TGCTAATTTT AATGCTTCTA 7500

ATTCGAATTT AGTTACTTTC TCAGCAGTGC TATTAGTAAT CTTATTAGTC ATTACAATTT 7560 

TACTATTGCC TTATTTAGAT GTATTGCTTT TAGGTCGTGC TGAAGCAATT AATCTTGGGA 7620 

TATCGTATGA AAAATTAACG CGAATT 7646

(2) INFORMATION FOR SEQ ID NO: 122: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1194 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

EP 0 786 519 A2



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122: 

ATGAATATAT TTnnAAATAA ATTATTATGG ATTGCACCAA TnGCCACTAT GATTATCTTG 60 

GTAATCTTTT CTTTAGCTTT TTATCCTGCA TATAATCCTA AACCAAAAGA TTTACCAATT 120 

GGTATATTAA ACGAGGATAA AGGTACAACG ATTCAAGATA AAAATGTTAA CATTGGTAAA 180 

AAATTAGAGG ATAAATTATT AGATAGTGAT TCTAATAAAA TTAAATGGGT TAAGGTTGAT 240 

AGTGAAAAAG ACCTTGAAAA AGATTTGAAA GATCAAAAAA TCTTTGGAGT AGCTATTATT 300 

GATAAAGACT TTTCAAAAGA TGCTATGAGT AAAACACAAA AAGTAGTTAT GGATAGTAAA 360 

AAAGAAGAAA TGCAACAAAA AGTTGCTTCA GGTGAAATTC CGCCACAAGT GGTTCAACAA 420 

ATGAAACAAA AAATGGGGAA TCAACAAGTA GAGGTTAAGC AGGCTAAATT TAAAACGATT 480 

GTAAGTGAAG GATCAAGCTT ACAAGGTTCA CAAATTGCAT CAGCTGTGTT AACTGGTATG 540 

GGTGATAATA TTAATGCTCA AATTACGAAG CAAAGTTTGG AAACATTAAC GAGTCAAAAT 600

GTTAAAGTCA ATGCCGCGGA CATCAATGGT TTGACGAATC CAGTAAAAGT GGATAATGAA 660 

AAACTTAATA AAGTTAAAGA TCACCAAGCA GGTGGTAATG CACCATTCCT AATGTTTATG 720 

CCAATTTGGA TAGGTTCAAT CGTAACGTCT ATCTTATTGT TCTTTGCATT TAGAACTAGT 780

AACAATATCG TCGTGCAACA TCGTATCaTT GCtTCAATTG GACAGATGAT ATTTGCAGTT 840 

GTTGCAGCAT TTGCAGGTAG CTTTGTTTAT ATTTATTTCA TGCAAGGCGT TCAAAGATTT 900 

GATTTTGACC ATCCAAATCG TATCGCAATT TTTGTAGCAT TTGCGATTCT TGGTTTCGTG 960 

GGCCTTATTT TAGGTGTTAT GGTATGGCTA GGTATGAAGT CAGTTCCAAT TTTCTTCATT 1020 

TTAATGTTCT TTAGTATGCA ACTTGTAACG TTACCTAAAC AAATGTTGCC TGAAAGTTAT 1080

CAAAAATATG TATATGATTG GAATCCATTC ACACACTATG CAACAAGTGT AAGAGAcTAT 1140

TATACTTGAA tcatcatatt GAATTAAATA GTACAATGTG GATGTTTATA GGGT 1194 
(2) INFORMATION FOR SEQ ID NO: 123:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 553 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123:
GACCGACCTA TACATCCGTA TAAGTATTTC TTGATATAAG TCTTCTAAAT CATAATGATT 60 
AAATCCAAAT GTTTTGATGC GTCGAATAAT TAATGGTTGT AGATCCATTA CTAACTTTTC 120 
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GTATTTCAAA TATTAAACTA ACCCCTTCTA TCTAAAATTT AAGGTTAGTT TAATATTGTT 240 
ACATTCAAAA TTTCAAGATG ACGGAAATGT CATTTCTTAT GATGTCCTCT TCGTATTTTT 300

TCAAATTCTG CAAGGATTTC AGAAGATAAC GGAATTCGAG TTCTTGGCTT GTTTTCACTT 360

ATATCATCTA ATGATTTACT CACATCAATT TCATTTTCTT TTAAATCTCT CCACATTTCG 420
CGAGATGATA TTCTATATGC ACCTGATCCA AAGATAGCAT GTTGcTCACT CaTATCACTT 480

GTTACAACTG TAATATGcTT AGtATGCTTG tCaTAAAGtT CaTAAACCAT AACGGTTCTA 540 
ATGGAAACCA ATCAGCTG 559 
(2) INFORMATION FOR SEQ ID NO: 124: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7762 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124: 



25 


GCTTCAGACA TnTGATGATA 


TAATCTCTCA 


TCATCGATTA 


ATTCTTTTGC 


AGCTTGATAC 


60 




ACATnTTGCT 


TATTTGTTCC 


AATGACTTTT 


AATGTGCCAG 


CTTCAACACC 


TTCAGGACGT 


120 




TCTGTAACAC 


TTCGCCAAAA 


CTAAAACTGG 


CTTATTAAAT 


GATGGCGCTT 


CTTCCTGAAT 


180


30 


TCCACCTGAA 


TCTGTCAAAA 


TAAAATAAGA 


TTTTr.TAGCA 


AAATTATGGA 


AATCTATACG 


240 




TCCAAAGGTT 


CAATCAATTC 


AATTCTGTCA 


TGACTACCTA 


AAATCTTTTG 


AGCCACCTCT 


300 


35 


CGAACTTTCG 


GGTTTTTATG 


CATTGGATAT 


ACCAGTGCTA 


AATCAGTATA 


CTCATCTATT 


360


AAGCGTCTAA 


CCGCTTTAAA 


TATATTTTCC 


aTGGGTTTCC 


CGATATTTTC 


TCGTCGGTGT 


420 




GCTGTCATrA GAATGAATTT 


kTtGTCATGG 


TATTTATCCA 


TGATGTTAGA 


TTTATAATTG 


480


40 


TCATCAACT3 


TATATTTCAT 


AGCATCAATC 


GCAGTATTAC 


CAGTGACAAC 


AACACTTTCT 


540 




GAATATTTCC 


CTTCACTTAA 


CAAATGCGAT 


GCAGCATTTT 


TAGTAGGTGC 


AAAATGTAAG 


600 




TCAGCTAATA 


CACCAACTAA 


TTGTCTATTC 


ACCTCTTCTG 


GAAAAGGTGA 


ATATTTATCA 


660 


45 


TAACTTCTAA GCCCTGCTTC 


AACGTGTCCA 


ATCGGCACTT 


GGTTATAAAA 


TGCCGCTAAA 


720 




CCACCTGCAA ATGTCGTCAT 


CGTATCACCA 


TGTACAAGTA 


CCATGTCTGG 


TTTTTCTAAT 


780




TGAATCACTT 


GTTCTAATTG 


AGTGATTGAT 


TTAGAAGTTA 


TCTCAGAAAG 


TGTCTGTCCT 


840 


SO 


GATTTCATAA 


TATTCAAATC 


GTATTTTGGT 


TTGATTTCAA 


AGGTACTTAA 


TACTGAATCA 


900 




AGCATTTCTC 


TATGCTGTGC 


TGTAACAACA 


ACAATTGGCT 


CGAGCATTTT 


TTCTTGTTCC 


960 
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ATCTTTTTCA 


TCAAACTACT 


TATCTCCGAT 


TCTTCTATTT 


AGTACCAAAC 


AATCTATCTC 


1080 




CAGCGTCGCC 


TAACCCTGGT 


GTGATATATG 


CTTTGTCATT 


aGCTTTTCAT 


CAAGTGCAGC 


1140


5 


AATATAAATA 


TCTACATCTG 


GATGTGCTTC ATGCATCTTT 


TCTACGCCTT 


CTGGTGCTGC 


1200 




AATTAAACAC 


ATGAAGCGAA 


TATTTTTAGC 


GCCACGTTTC 


TTCAATGAAG 


TAATAGCTTC 


1260 




AATTGCTGAT 


GCGCCTGTTG 


CTAACATAGG 


ATCAACAACA 


ATGATTTGTC 


TTTCAGTAAT 


1320 


10 


ATCTTGAGGT 


AACTTAGCAA 


AATACTCTAC 


AGCCTTTAAT 


GTTTCGGGAT 


CTCGATATAA 


1380 




ACCGATATGT 


CCAACTCTGG 


CTGCAGGTAC 


TAAACTTAAA ATACCATCAG 


TCATACCTAA 


1440 


15 


ACCAGCTCTT 


AAAATTGGAA 


CGATAGCTAA 


TTTTTTACCA 


GCTAATCGTT 


TAGCCGTCAT 


1500 


TTTAGTTACA 


GGCGTTTCAA 


TATCAACATC 


CTGAAGCTCT 


AAGTCTCTAG 


TTACTTCATA


1560 




TGCCATCAAC 


ATACCAACTT 


CGTCTACAAG 


TTCTCTAAAT 


TCTTTAGTAC 


CTGTATTTAC 


1620 


20 


ATCTCTAATA TAGCTTAGTT 


TGTGTTGAAT 


TAATGGATGA 


TCGAAAACGT 


GTACTTTACT 


1680 




CATAAAAATT ACTCCTATCT 


TTGTGTATGT 


TTATTGATAT 


AGAGGATATT 


CAGCTGTTAA 


1740 




TTTCGCAACG 


CGTTCTTTAG 


CTTGTTGTAA 


TTTTTCTTCA 


TCTTTACTAT 


TTTTCAATGC


1800 


25 


TAAACTGATG 


ATTTTTGCAA 


CTTCCTCAAA 


AGCTTTTTCA 


TCAAATCCAC 


GCGTTGTTGC 


1860 




AGCAGGTGTA 


CCTAAACGTA 


TACCACTCGT 


TACAAAAGGT 


TTTTCTTGAT 


CGAACGGAAT 


1920 




GGTATTTTTG 


TTACATGTGA 


TACCAACTGA 


ATCTAAAGTC 


TCTTCAGCTT 


CTTTACCAGT 


1980 


30 


AAGTCCTATA 


GACCCTTTTA 


CATCAACAGC 


TACTAAGTGA 


TTATCTGTAC 


CGCCAGAAAC 


2040 




AATTCTAAAT 


CCTTCATTAA 


TTAATGCTTC 


TGCAAGAACT 


TTTGCGTTTT


TAACCACTTG


2100 




TTGTTGATAC 


GTTTTGAAAT 


TATTTTCTAA 


CGCTTCTCCA 


AAAGCAACTG 


CTTTtGCTqC 


2160


35 


AATAACATGC 


TCAAGAGGTC 


CACCTTGAAT 


ACCAGGGAAA


ATTGTTTTAT 


CTATGTCTTT 


2220 




TTTATATTCT 


TCCTTACATA 


AAATCATACC


ACCACGtGGT 


CCGcGTAATG 


TTTTGTGTGT 


2280 


40 


TGTAGTTGTT 


ACAAAATCAG 


CATATTCTAC 


TGGATTTGGA 


TGTAAACCTG 


CCGCTACTAA 


2340 


TCCTGCAATA 


TGTGCCATGT 


CTACCATTAA 


CTTAGCGTTT 


ACTTCATCTG 


CGATTTCTTT 


2400 




AAACTTTTTG 


AAGTCAATTG 


TTCTTGAATA 


TGCTGATGCT 


CCTGCCACAA 


TAAGCTTAGG 


2460 


45 


CTTATGCTCT 


AACGCTAATT 


TACGAACTTC 


ATCATAATTG 


ATTCGTTCTG 


TGTCTTTATC 


2520 




TACTCCATAT 


TCAACGAAAT 


TGTAGAATTT 


ACCACTAAAA 


TTAACAGGCG 


CTCCATGTGT 


2580 




CAAGTGACCA 


CCATGACTCA 


AATTCATACC 


TAAAACTGTG 


TCGCCCATrr 


CTAATGCAAC 


2640 


50 


TAAGTAAACA 


GCCATGTTCG 


CTTGTGAACC 


TGAATGTGGT 


TGAACATTGA 


CATGTTCAGC 


2700 




TCCAAACAAT 


GLT1TAGCAC 


GATCAATTGC 


GATGCTTTCA GTAACATCTA 


CAAACTCACA 


2760 
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TTGTGCTTCC ATAACCGCTT 


CCGATACAAA ATTTTCCGAT 


GCGATTAACT 


CTATGTTGCT 


2880




ATTTTGTCTC 


TGAAATTCTC 


TCTCGATTGC 


TTCTGCGATA ACTTTATCTT 


GCTTGGTGAT 


2940 


5 


ATAAGACATA AAATCTCCCC 


TTCTTTCAAA 


AAAACTTATT 


GGTATTTAGC 


ACGTTCGCCA 


3000 




CCAATCTTTT 


TCGGCCTAGA 


TGTGGCAATA 


GTTACAATTG 


CCTGTCCTAC 


TTGCTTTACT 


3060 




GAGGTCCTTA 


CAGGTACACA 


TACATGTTTA 


ATATGCATGC 


CTATTAACGT 


TTGACCAATA 


3120 


10 


TCAATTCCAC 


AAGGAACAGT 


AATATGTTCG 


ACCACGATCG 


GATCCTTCAT 


ATGCTGAAAA 


3180 




GCGTATGTTG CCAAACTCCC TCCAGCATGT ACATCTGGAA CGACGGAAAC TTCTTCCATT 


3240 


15 


GTTAATGGAT


TATACTGAGA 


TTTTTCTATT 


GTTATCGCTC 


TGTTGATATG 


TTCACATCCT 


3300 


TGAAAAGCAA AAGTAACGCC 


TGTCTCTTTA 


CTCACAACAT 


CTAATGCATT 


AAAAATAGTT 


3360




TCTGCAACTT 


CCaTCGAACC 


GACAGTCCCT 


ATTTTTTCGC 


CAATGACTTC 


CGATGTTGAA 


3420 


20 


CATCCAATTA 


AACATATATC 


TCCTTTATTA 


AAAAAGGACA 


TATCTTTTAA 


TTCGTCTAAT 


3480




AACATTGTCA 


AATCTTTCAT 


AAAAGCCCAC 


CCTTCCTAAA 


AATAAAAAAG 


GAATATAGCA 


3540 




AAGTGCTACA 


CTCCTCTATT 


ATAACTTATT 


TAACTGTTAA 


CATATACTAA 


TTATACAGAA 


3600 


25 


TTCCTACTAG 


CAAATAATAT 


CTTTTAATTT 


TAAAATTAAA 


CTTACAAGTT 


CTTCATAGGT 


3660 




ATGTACATAC 


ATTTCTTTTG 


TTCCACCGTA 


TGGATCTATA 


ACTTCTCCTG 


CTTCTTTCAC 


3720 




ATATTCATGC 


AATGTGAAAA 


CATGATTTTG 


CAAACCAAAG 


TGTGCCTCTA 


TTAATTCTTT 


3780 


30 


GTGCGAATAC 


GACATCGTCA 


AAATAATATC 


TGCTTTCAAA 


TCTGCTTCAG 


7AAATTGTTG 


3840 




CGATAAGGTC 


GTTTCAGCTA 


AATGATGTTC 


TTCAACTAAG 


TCTTCAACAT 


AATTCGAAAC 


3900 




ACCTTGATTG 


TTCACAGCGA 


ATATACCTCT 


TGATTCAAAT 


TGATGATT7G 


GCATAACCTC 


3960 


35 


TTTTGCAATA 


CTTTCCGCTA 


ATGGGCTACG 


ACATGTGTTA 


CCTGTACAAA 


CGAATAAAAT 


4020 




CTTCATAGTT 


CACATCCTTT 


AATAATGTGA 


TTACCTGCAG 


CTTTTAACAT 


GCGATTCATA 


4080 


40 


ATTGCTTCTG


TATTATCATT 


CAGCTCAAAG 


CCGTATATAT 


ACGCCGCTGA 


AATATTTTCA 


4140 


TTTTCATCAA 


GTGAATGTAA 


CACATCATAA 


AGATTATGAC 


TTGCTTGTTT 


AACATCATTG 


4200 




TCATCCTGAC 


ATAATTGAAT 


GAATTGCGCT 


TCACTTGGTA 


TAAACGCCAC 


CTTATTACTC 


4260 


4o 


GGCACAATAA 


AAGCTATAGA 


AGACCAATCT 


TTACCGTCAT 


TTCCAATTTT 


GCTCTCAATA 


4320 




TCTGTAATAA 


TTGTAAGTGG 


TGTATTGGGT 


GAGTAATGCT 


TATACTTCAT 


ACCTGGTGCA 


4380 




ATTGGCTGTT 


CAGTATCATT 


ATAATCAGCA 


TGGGCGATAC 


TATTCGGAAG 


TATTTCTGTA 


4440 


SO 


ATCATTGCTG 


CTGTTATAGA 


ACCAGGTCTT 


GCAATTTTAT 


AAGGAAAAGA 


TGTGCAATCT 


4500 




AAAACCGTAC 


TTTCTAATCC 


TTCTTCACTT 


TGTTCAGCTT 


GAACAATACC 


ATCGATACGG 


4560 
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GCACTTGGAG 


CAGCTAGAGG 


TTCATTTATG ATTTGTAATA ATTGTCTACC TACAGAATGG 


4680 




CTTGGCATTC 


TAACAGCAAC 


TGATGATAAA 


CCTCCAGAAA 


CTTTTCGACA 


TAGATAGCCT 


4740 


5 


AGCTTTAACG 


GCAATATAAA 


CGAAATAGGG 


CCCGGCCAGA 


ATGCCTGCAT 


TAACTTTTCT 


4800




ACGCGTGGAT 


CCAAAGTATA 


TGTAAAATCT 


TTTAATTGAC 


CTTTACTGTG 


TATATGAACA 


4860 




ATAAGCGGAT 


TGTCAGATGG 


ACGGCCTTTA 


GCTTCATATA 


TTTTAGCTAC 


AGCTTCTTCA 


4920 


10 


TCTGTCGCAT 


TTGCTGCAAG 


TCCATAAACT 


GTTTCAGTTG 


GTAAACCTAT 


TAAACCACCG 


4980 




TTTAAAACAA 


TGTCTTTTAT 


TTCATTAATT 


TTAGGATATT 


GCTGTAAATC 


TTCATTATAT 


5040 


15 


TCTCTAACAT CCCAAATTTT AGTATCCAAC TTAATCACGC CTTTCTTATT 


TATCATAATA 


5100 


TAAAGCAAAA AGCTATGCAC 


TTAACTAATC 


ATAGCAAAGG 


CATAACTTCT 


AATTACCATT 


5160 




TAAATGAGAC 


GATTCGATCG 


TGGCCATTTA 


TATCTTTAAT 


AATGTCGATT 


TTTTTGTCAG 


5220 


20 


GAAATTTATT 


TAAAATTATT 


GATTTAAGTG 


CCTCACCTTG ATTGTAACCA ATTTCAAAAA 


5280




CAACTGGGCT 


GCCTTTTTCC 


ATAACGTGAG 


GTAAATCTTC 


AATGATTGAT 


TCATAAATAG 


5340 




CATATCCATG 


GTTATCTGCA 


AACAATGCCT 


GATGTGGTTC GAATCTCGTA ACCGTTGGAG 


5400 


25 


ACATCGTAAC 


CATATCTTTT 


TCATCTATAT 


ATGGTGGATT 


AGATATCAAG 


CCGTTCAACT 


5460 




TGATACCTTC 


ATTAATTAAG 


GGCTTTAATG 


CATCCCCTGT 


TAAAAATTGT 


ATTTGTGATT 


5520 




GATGCTTCTC 


AGCATTATTA 


CGAGCCATAT 


TCATTG CTTC 


AAGTGAAATA 


TCAGT AG CAA 


5580 


30 


TAACATTTAA ATCCGGCT7T 


TCACATTTCA 


AAGTAATTGC 


AAGTACACCA 


CTACCCGTTC 


5640 




CGATATCTAC 


GATTGTTGCA 


TCATCTTCTA 


ACTGTTGTAA 


GAAATGCAAC 


ATTACTTCTT 


5700 




CAGTTTCAGG 


TCTTGGTATC 


AAACAATTTG 


AGTTTACATC 


AAACGTTCTA 


CCATAAAATG 


5760 


35 


AGGCAAAGCC 


AACTATATAC 


TGTATAGGCT 


CTCCTAATAA 


CATACGTTGT 


AATGCTAAGT 


5820 




CGAACTTCAT AATCATCGCT 


TTCGGCATAT 


CATCATGCAT 


GTGGACTACA 


AAGTCCGTAC 


5880 


40 


GCGTCCATTG 


AAATACATCT 


AACATTAACC 


ATTCAGCTCG 


TGTTTGTTCA 


AACCCTTTTT 


5940 


GTTGTGTTAA 


ATGAATTGCT 


TCATCTAACT 


TTTCTTTATA ATTCACCATT 


ATTAAGTTCT 


6000 




TTCAATTTAT 


CTGTCTGCTC 


TGATAAAGTC AGTGCATCTA TAATTTCTTC TAAATGGCCT 


6060 


45 


TCCATAATTT 


GCCCTAATTT 


TTGAAGCGTT 


AGACCTATAC 


GATGGTCTGT 


TACACGGCTT 


6120 




TGTGGATAAT 


TATAAGTTCG 


AATACGTTCT 


GAACGATCAC 


CAGTACCGAC 


TGCTGATTTA 


6180 




CGTTGTGACG 


CATACTTTTG 


TTGTTCTTCT 


TGAACTTTCA 


TATCGTATAA 


ACGTGCTTTT 


6240 


SO 


AACACTTTCA 


TTGCTTTTTC 


ACGGTTTTGA 


ATTTGAGACT 


TCTcAGAAGA 


TGTTGCAATG 


6300 




ACACCAGTTG 


GTAAATGGGT 


AATACGTACT 


GCAGAGTCAG 


TTGTGTTTAC 


GTGCTGACCA 


6360 




EP 0 786 519 A2 

ACATCTTCAA CTTCTGGTAA AACTGCCACT GTAGCTGTTG AAGTATGAAT ACGTCCACCT 6480 

GATTCTGTTT CAGGCACACG TTGAACGCGG TGCGCACCAT TTTCAAATTT CAATTTACTA 6540 

TACGCGCCAT TACCAGAAAC TGAGAAACTA ATTTCTTTGT AACCACCATG GTCACTTTCA 6600 

GACGCTTCTA CTATTTCAGT TTTGAATCCT TGTGATTCAG CATACTTTGA ATACATACGC 6660 

ATTAAATCAC CAGCAAAAAT CGCAGCCTCA TCACCACCTG CTGCTGCTCT TATTTCTACA 6720 

ATAACGTCTT TGTCATCATT AGGATCTTTA GGAATCAATA ATATTTTAAG CTCTTCTTCA 6780 

AGATTTGGAA GTTCAGCTTT AATACCATTA CTCTCCTCTT TTAACATTTC TACTTCTTCT 6840 

TTATCATCAG TCTCACTTAA CATTTCTTCA ATATCAGCTA ATTCTTCTTT TTTAGCTTTA 6900

TAGTTACGAT AAACATCTAC AGTTTTTTGT AAATCAGCTT GCTCTTTAGA ATATTTACGT 6960 

AATTTATCTG AATCATTTAC AACATCTGGG TCACTTAACA GTTCATTTAA CTGTTCGTAT 7020 

CTTTCTTCTA CAATATCTAA TTGATCAAAC ACTTATAATT CCTCCTTATT ATTATCACTA 7080 

GGTGCTACGA TATGGTGCGC GCGACAACGT GGCTCATAAC TTTCATTGGC ACCTACTAAG 7140 

ATAATCGGAT CATCGATTTT AGCTGGTTTA CCATTTATTA ATCGTTGCGT TCTACTAGAT 7200 

GAAGAACCAC AAACAGCACA AACTGCTTGA AGTTTCGTTA CTTGTTCACT GACAGCCATC 7260

AATTTAGGCA TTGGTTCGAA CGGTTCGCCC CTAAAATCCA TATCTAATCC AGCAACAATA 7320 

ACACGGTGTC CATCTGCTGA TAGTTTTTCT ACTATACTTA CAATTTCATC GTCAAAAAAT 7380

TGCAcTTCGT CTATTCCTAT AACATCAACA TTAGTTAAGT CGTGCGTCAT AATTTCACTT 7440 

GCTTTAGAAA TATTAATCGC TTCAATGGCA TTACCATTAT GAGAGACCAC TTTTTCTTTA 7500

TGATATCGAT CATCAATCGC CGGTTTAAAT ACAACGACTT TTTGTTTAGC GTATATACCC 7560

CTTCTTAGAC GTCTTATTAG TTCTTCGGAT TTACCGCTAA ACATACTACC TGTAATACAT 7620

TCTATCCAAC CGGAATGGTA AGTTTCATAC ATTGAGAGTn CCACCTTTTT CAAAACATAA 7680

TCGCTTTATT ATATCATATT TCAAATATTC ATAAATGTCT TTnTCATAAT TATATCGATA 7740 

TTGTACATGA ACAATTATTT TA 7762 
(2) INFORMATION FOR SEQ ID NO: 125:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2583 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125: 
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TAAAAAAATT ATTATCAATG ATGAACTAGA ATTGACTGAA TTCCACCAAG AACTTACTTA 120 

TATTTTAGAC AACATAnAAG GGAATAATAA TTATGGTAAG GAATTTGTTG CAACCGTTGA 180 

AGAAACATTC GACATTGAAT AaAGCGGGGT GgaAGCACTA TGAATCAATG GGATCAGTTC 240 

TTAACACCTT ATAAGCAAGC GGTTGATGAG TTGAAAGkGA AcTTaAAGGC ATGCGCAAAC 300

AATATGAAGT TGGTGAACAA GCGTCGCCAA TAGAATTTGT TACTGGTCGT GTTAAACCAA 360 

TCGCTAGTAT TATAGATAAG GCAAACAAAC GACAAATACC ATTTGATAGG TTAAGAGAAG 420 

AAATGTACGA TATCGCTGGT TTAAGAATGA TGTGCCAATT TGTTGAAGAT ATTGATGTTG 480 

TCGTCAATAT TTTAAGACAA AGAmAAGATT TTAAAGTAAT TGAAGAACGA GATTATATTC 540

GTAACACTAA AGAAAGTGGT TACCGCTCGT ATCATGTCAT TATTGAATAT CCAATTGAAA 600 

CATTACAAGG CCAAAAATTT ATATTGGCTG AGATTCAGAT TCGTACATTA GCAATGAATT 660 

TCTGGGCAAC GATTGAACAT ACTTTACGAT ATAAATATGA TGGTGCTTAT CCGGATGAAA 720 

TTCAACATCG TTTGGAAAGA GCGGCAGAAG CAGCGTATTT ACTTGATGAA GAGATGTCTG 780 

AAATTAAAGA TGAAATTCAG GAAGCTCAAA AATATTACAC GCAAAAACGT TCTAAAAAAC 840

ATGAAAATGA TTAACGAGGT GTTATAAATC ATGCGTTATA CAATTTTAAC TAAAGGTGAC 900

TCCAAGTCTA ATGCCTTAAA GCATAAAATG ATGAACTATA TGAAAGrTTT TcGCATGaTT 960

GaGGATrGTG AAAaTCCTGA AATTGTTATT yCAGTTGGTG GTGACGGTAC ATTACTACAA 1020

GCATTCCATC AGTATAGCCA CATGTTATCA AAAGTGGCAT TTGTTGGAGT TCATACAGGT 1080

CATTTAGGAT TTTATGCGGA TTGGTTACCT CATGAAGTTG AAAAATTAAT CATCGAAATT 1140

AATAATTCAG AGTTTCAGGT CATTGAATAT CCATTGCTTG AAATTATTAT GAGATACAAC 1200 

GACAACGGCT ATGAAACAAG GTATTTAGCA TTAAATGAAG CAACGATGAA AACTGAAAAT 1260 

GGCTCAACAC TTGTTGTGGA TGTTAACTTA AGAGGGAAAC ACTTTGAGCG ATTTAGAGGC 1320

GATGGATTAT GTGTATCAAC ACCTTCGGGT TCAACGGCTT ATAACAAAGC GCTAGGTGGC 1380 

GCACTGATAC ATCCTTCACT TGAAGCAATG CAAATTACAG AAATTGCCTC GATAAATAAT 1440 

CGTGTGTTTA GAACGGTAGG ATCACCACTT GTATTACCAA AGCATCATAC ATGTTTAATA 1500

TCACCAGTTA ATCATGATAC CATTAGAATG ACGATAGATC ATGTTAGTAT CAAACATAAA 1560 

AATGTTAATT CAATACAATA CCGTGTAGCA AATGAAAAAG TGAGGTTTGC ACGTTTTAGA 1620 

CCATTCCCAT TCTGGAAACG TGTGCACGAT TCTTTCATAT CAAGTGATGA AGAACGATGA 1680 

AATTTAAGTA TCATATATCA CAACAAGAAA CTGTTAAAAC TTTTTTAGCA CGACATGATT 1740

TTTCTAAGAA GACAGTGAGC GCCATTAAAA ATAATGGCGC TTTAATTGTT AATGATGAAC 1800 
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AAATACCGAG TGTTAATTTA ATACCTTATG CTCGTAAGCT AGAAGTATTG TATGAAGATG 1920 

CTTTTATCAT CATAGTTACT AAACCAAACA ATCAAAATTG TACGCCTTCG AGAGAACATC 1980 

CTCATGAAAG TTTAATCGAA CAAGTACTAT ATCATTGTCA GGAACATGGT GAAAATATTA 2040 

ACCCACATAT TGTTACGCGT CTAGATCGTA ATACAACTGG TATTGTGATA TTCGCTAAAT 2100 

ATGGACATAT CCATCATTTA TTTTCTAAAG TAAACTTGAA AAAAATATAT ACTTGCCTTG 2160 

TATATGGTAA AACCCATACA TCTGGTATTA TTGAAGCTAA TATTAGACGG TCAAAGGATA 2220 

GGATTATAAC TAGAGAAGTT GCCTCGGATG GTAAATACGC TAAAACATCT TATGAAGTAA 2280 

TAAATCAGAA TGATAAATAC AGTTTATGCA AAGTTCATTT GCATACGGGA CGTACACATC 2340 

AAATTCGTGT ACATTTTCAA CATATTGGGC ATCCAATTGT GGGAGATTCT TTGTATGATG 2400 

GTTTTCATGA CAAAATTCAT GGTCAAGTAC TGCAATGTAC GCAAATATAT TTTGTTCATC 2460 

CAATCAATAA GAACAATATT TATATTACAA TTGATTATAA GCAATTACTT AAATTATnCA 2520 

ATCAACTCTA ATnCACACAG GGGGTGTAAG TATGTCAATG AnCACAGATG AAAAAGAGCG 2580
TGT 2583

(2) INFORMATION FOR SEQ ID NO: 126:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1818 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126:
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ATCAAGTGAT 


ACATTTAACT GGTAAAGGAT TAAnAGATGC TCAAGTTAAA AAATCnGGAT 60
ATATACAATA TGAATTTGTT AAAGAGGATT TnACAGATTT ATTnGCAATT ACGGATACAG 120
TAATAAGTAG AGCTGGATCA AATGCGATTT ATGAGTTCTT AACATTACGT ATACCAATGT 180
TATTAGTACC ATTAGGTTTA GATCAATCCC GAGGCGACCA AATTGACAAT GCAAATCATT 240
TTGCTGATAA AGGATATGCT AAAGCGATTG ATGAAGAACA ATTAACAGCA CAAATTTTAT 300
TACAAGAACT AAATGAAATG GAACAGGAAA GAACTCGAAT TATCAATAAT ATGAAATCGT 360
ATGAACAAAG TTATACGAAA GAAGCTTTAT TTGATAAGAT GATTAAAGAC GCATTGAATT 420
AATGGGGGGT AATGCTTTAT GAGTCAATGG AAACGTATCT CTTTGCTCAT CGTTTTTACA 480
TTGGTTTTTG GAATTATCGC GTTTTTCCAC GAATCAAGAC TTGGGAAATG GATTGATAAT 540
GAAG-I-iTATG AGTTTGTATA TTCATCAGAG AGCTTTATTA CGACATCTAT CATGCTTGGG 600
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CTCATGTTAA AGCGCCACAA AATTGAAGCA TTATTTTTTG CATTAACAAT GGCATTATCT 
GGAATTTTGA ATCCAGCATT AAAAAATATA TTCGATAGAG AAAGACCTAC ATTGCTGCGT 
TTAATTGATA TAACAGGATT TAGTTTTCCT AGCGGTCATG CTATGGGATC AACTGCATAT 
TTTGGAAGTG GTATCTATCT ATTAAATCGA TTAAATCAAG GTAATTCAAA AGGTATTCTT 
ATAGGGTTAT GTGCAGCTAT GATTTTATTG ATTTCCATAT CACGTGTATA TCTAGGTGTA 
CATTATCCAA CAGATATTAT TGCCGGCATT ATTGGTGGAT TATTTTGcAT TATTTTATCA 
ACGTTATTAC TTAGAAATAA ATTAATAAAT TAAATAGTAA AAAAACAAAA GCAGTAAACC 
TAAAGTGTCG TAAGGGTTTA CTGCTTTTAT AAAACGTTGT TATAACGTAT ATTGTCTTTT
ACGGGCATAT AAnAGGGGAA TATTTGAnAA TGACCAATCC AACAAGAACG AAACGTTGTG 
GGGGGGATGT TCTATGTGGT ATTGATAATC ATTTTCAACT ACTATTATAC ATTAGTGAGA 
ATCATTGTCA ATTAGAAACT AAAACTTTTT TTGAATATTT TTTAAGAATA GTAAATAAAA 
CGCATGATTA CGCTATTTTA GAAAATAAAA AAATTTGTAT TTCTCATTAG AATTAGAATA 
TTTAAAAGTG ATGAGGTTTA AACATTATAT TGTTTACATA CTCCTTTTGA ATTCATACAT 
TATGAAATGT tACTTCCAAG TTCAAAATCG CACATTGAAA TGATGTGTGA AATGTTTAAA 
CTACGGTCAT tTTGTGmAAA TAAAGrTAAT AACTATTCAT TTTACAATAG TGAAAAGTCA 
GTATATGACA ACAATTAATA TTGCGGTAAG GCCTTGTGTT ACAGTATTCT ATATTTAAGT 
ACTGCAATCA GAATTAACAG AATGCCATTA ACTGATTATT AAATATTTGA GTTAATAAAT 
AATTAATGAT TGTAGCTTGA AAAATTTAAA ACATGGTTAT TGATTTGTGA TAAAATTTAA 
ACGTAAACAA ACTAATTTAA AAAGCAACTA TTGTATAGAA AAATACAAAA TTTAAAATAT 
ATTACCTTAT TAGAAAAA 

(2) INFORMATION FOR SEQ ID NO: 127: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12658 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127:
TGTTTAAACA ATAGGGGGAA TCTTATGATT GAAAAATTAG TAACCTTTTT AAATGAGGTT 
GTTTGGAGTA AGCCATTAGT TTATGGTTTG CTAATTACTG GTGTGCTATT TACATTGCGT 
ATgCGATTTT TTCAAGTTAG ACATTTTAAA GAAATGATTC GATTAATGTT TCAAGGAGAG 
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GGTACAGOTA ATATTGTCGG TGTATCTACT GCAATATTTA TAGGAGGACC TGGTGCAGTA 300 

TTTTGGATGT GGATTACTGC GTTTTTAGGT GCAAGTAGTG CTTTTATTGA ATCTACACTT 360 

GGTCAAATAT TCAAGAGAGT TGAAAATAAT GAATACCGTG GTGGACCAGC GTATTATATT 420 

GAATATGGTA TTGGTGGTAA ATTTGGTAAA ATTTACGGAA TTATCTTTGC TATTGTTACG 480 

ATTATCTCAG TAGGTCTATT GCTTCCTGGT GTGCAATCTA ACGCTATAGC AAGTTCTATG 540

CATAATGCGA TTCATGTTCC ACAATGGTTA ATGGGTGGTA TTGTTGTAGT TATTTTGGGA 600 

TTAATTATTT TTGGTGGTGT ACGTATTATT GCCAATGTTG CAACAGCCGT TGTACCATTT 660 

ATGGCAATTA TTTACATACT GATGGCTGTC ATTATCATTT GTATCAATAT ACAAGAAGTG 720 

CCAGCGTTAT TTGCATTAAT TTTCAAATCA GCATTTGGAT TACAATCTGC TTTTGGTGGT 780 

ATCGTTGGCG CAATGATAGA GATTGGTGTT AAACGTGGAT TATATTCAAA TGAGGCTGGT 840 

CAAGGTACAG GTCCACACGC AGCAGCGGCa gcAGaAGTAT CACATCCAAG TAAACAAGGT 900

CTAGTACAAG CATTTTCAGT TTATATTGAT ACATTATTTG TATGTACTGC AACTGCTCTG 960 

ATTATACTTA TTTCTGGTAC ATATAATGTG ACTGATGGTA CGGTTAATGC GAATGGCACA 1020 

CCGCATTTAA TTAAAGATGG CGGTATTTAT GTTgAAAATG CAACAGGTAA AGATTATTCA 1080

GGTACTGCGA TGTATGCACA AGCCGGCATt GATAAAGCGT TCCATGGCAG TGGTTATCAA 1140

TTTGATCCTA CTTTCTCTGG CGTAGgTTCG TACTTTATTG cATTTGCTTT ATTCTTCTTT 1200

GCATTTACTA CAATTTTGTC GTACTACTAC ATTACAGAAA CAAATGTTGC TTATTTAACG 1260

CGTAATCAAA ATAATCAAGT TTCATCGATA TTTATTAATA TTGCTCGTGT GATTATTTTG 1320

TTCGCTACAT TTTACGGTGC AGTTAAAACA GCTGATGTAG CATGGGCATT CGGTGATTTA 1380

GGTGTAGGTC TAATGGCTTG GTTAAATATC ATTGCGATTT GGATTTTACA TAAGCCTGCC 1440 

GTAAATGCTT TAAAAGATTA TGAAATTCAA AAGAAACGTT TAGGCAACGG TTATAATGCA 1500 

GTTTATCAAC CTGATCCGAA TAAATTACCT AATGCTGTCT TTTGGTTGAA GACGTATCCA 1560 

GAACGTTTAA AACAAGCACG TGCCAAAAAG TAATCTACTT TTGTTTATAG TATATGTAGT 1620 

GATCATTTGA TAAAAAAGAA AAGTATTGAG AATTTTAGGt GCTCAGAAAT TTGAATTTTA 1680 

AAAATATAGT GTCTCTTGGT ACAATAACAA TACAACTACT AGGGGCACTT TTTTATGTCA 1740

GAATTTAAAA CTGGTAAGAT TAATAAACAT GTTTTATATA GTAATATTTT AAATAGAGAT 1800

GTCACGTTAA GTATTTATTT ACCAGAATCT TATAATCAAC TTGTTAAATA TAATGTCATT 1860

CTTTGCTTTG ACGGATTAGA TTTTTTACGT TTCGGGAGAA TACAACGTAC ATATGAATCG 1920

TTAATCAAAG AAGCGCGTAT TGATGATGCG ATCATTGTTG GATTCCATTA TGAAGACGTT 1980 
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GTCGGTAAAG AAATATTGCC ATTTATTGAC TCGACGTTTT CTACACTGAA AGTAGGTAAT 
GCAAGGTTAT TAGTAGGGGA TAGTTTAGCG GGTAGTATTG CCTTATTAAC GGCGTTGACC 
TATCCAACGA TTTTTAGTCG TGTAGCAATG TTAAGTCCAC ATTCAGATGA AAAAGTATTA 
GATAAGCTAA ATCAATGTGC AAATAAAGAA CAATTGACAA TTTGGCATGT CATTGGTCTA 
GATGAAAAAG ATTTTACTTT ACCAACAAAT GGTAAGCGTG CCGATTTCTT AACACCGAAT 
AGAGAATTAG CTGAACAAAT TAAGAAATAT AATATAACTT ATTATTACGA TGAATTTGAT 
GGTGGTCACC AATGGAAAGA TTGGAAACCA TTGCTGTCAG ATATATTATT GTATTTTTTA 
AGTAAAAACA CAGATGATCA ACTTTATGAA TAATTTACAT TAGTAGATTT AGTATGAATT 
GTCTTCATAT AGTCTGGTCT ATAATATAAT TTATAAAAGA TTTTACTGTT TAATTTAATT 
TAAATTTGAC GAAATTGCAA AAGATGTATA ATGAATTATT TTTAATGTAA CGGTTTTCAA 
AGAAATTTGA TATAATAGCA ATAGGTTAAA CAAAGGAGGA ATTCAGATGA TTTTAGGATT 
AGCATTAATT CCATCAAAGT CATTTCAAGA AGCGGTGGAT TCTTACCGTA AAAGATATGA 
TAAACAGTAT TCACGAATTA AACCACATGT GACAATTAAA GCGCCATTTG AAATTAAAGA 
TGGTGATTTA GATTCTGTCA TTGAACAGGT TAGAGCTCGT ATTAATGGTA TACCAGCAGT 
AGAAGTTCAT GCTACAAAAG CTTCTAGCTT CAAACCAACG AACAATGTGA TTTACTTTAA 
AGTTGCGAAG ACGGACGACT TAGAAGAATT GTTTAATCGC TTTAATGGAG AAGATTTCTA 
TGGAGAAGCT GAACATGTTT TTGTGCCACA CTTTACAATA GCACAAGGAC TATCTAGCCA 
AGAATTCGAA GATATTTTTG GTCaAGTAGC ATTAGCTGGG GTAGACCaTA AAGAAATTAT 
CGATGAATTA ACTTTGTTAC GTTTTGACGA TGACGAAGAT AAATGGAAAG TTATTGAAAC 
GTTTAAATTA GCTTAAGTAA CATAATAGTA TTGTTAATCG TAGTATGTTT GAATTAATAA 
GAAAATGGTC ATTTTTATTG AATGTAATAA AAATGACCAT TTTCTTTATT TTAAAATACG 
TTTTAACCTT ACTTAGCTTT TTCTCTATTT ACTATAAAGT rGCTTCCATA AAATACAGCT 
AAGACTAAAA AGATTAATGC CGAGAAATAA AATGTATTGT TTAAATTGTT GGTAAATTGT
GTAATTAATC CGCCAAATAA TGGCCCTATC ATTGAGCCGA ATCCTTGGAT ACTATTAAAA 
ACACCCCAAG TTTCTTCTTG TTCATCTGAT TTGATAAATC GTGCCATAAA GGTATTCCAT 
GCTGGTAATA AGATGCCATA CATTAGACCG ATAGCTAAAG CGATAATCCA CAAGATGTGA 
ATATTAACAA TCATAGATAG AGTAAAAATT AATATCATGT ATAAAATAAA TCCGCTTAGA 
ATAACACCAT ACATAAAGTT TCTGCTGCGG TTATCTATTA GTTTCGATAA AAATAGCATC
GAAACTGCAC AGCCGATACC ACCAATAATG ATTGCAACAG TATATTCAAT TGTGCTTACG 
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TGTAAAAGAA TACCAGGGAA CaACAATAAA 
TGGcGCTTTG TCACATCAAC AATTTGTCTC 3900
AATTGAGCTT TAACTGGACG AGTATTATAA TTTGTTAACT TTACATCGAC AAAATAATAT 3960
AATATCCATG CAATTAAAAC GACTAAAGAC ATCATGAAGG CAAAGCGTGT TGGGTGCACT 4020
TTGATAAGTA GATTCATAAA AACCATACCT ACCAATAGGC CTAACAACCA TGAAAAATAA 4080
ACATAGCCCA TTTGTTTGCC ACGTTTATCT TCTTCAACAC TGGATAACAT AATGACCCAA 4140
ATAGGACTAA CTGCAATACC GAGCATCATA GCACTAAATA TGATTACAAA AGGTGATGCT 4200
GGAAACCAAA TAACTAAAAA TAAACTTGTA AATGCTAAAA TAAATCCAGT CGTTAAAACG 4260
ATTTTTGTGC CGAATTTTTT CAGTAAAAAT CCTATAACAA AGTTTGTAGA TGCATCAGCA 4320
ATAAAATGTA TTGAAAATGC TAGAGACGTT ATTGCTACAG CAATGGATGT AACTGTTGGC 4380
AAGAAATTAA TATAGCTTAG GATATACATG CCTCTCGCAA ATTCCATTAA AAATAAGATA 4440
ATAAGCaTTA AAATGAAATT TTTATGATTA GCGTAATTAT TTAACGAAGA ATCTTGCATA 4500
TAAAGGAACC TTTCCATAAA TCTCTTGTGG TTGTGATGAA TGACCGATTA AATCAAGTAA 4560
GTCTCGACAT ATTGTCTGTG TAGCATACTT AATTTTATCT TGTTCCATTG TACTAATCAT 4620
GTTAGTTAAT TGCTCATTAC CGTTAGTTAA ACTTGCTACA ATTTTTATTG CTTCTTCTGG 4680
AGTATCAGCG ATTTTACCAA AACCTTTTTC TTCAAAGTAA AGGGCATTTT CAAGCTCTTG 4740
ACCAGGTGCA GGATTTAGGA AAATCATTGG AATACAACGG GCGAAACCTT CAGTTATTGT 4800
GATACCACCA GGTTTCGTAA TCATAAGTTG ACTTGATGCC ATCCATTCAT TCATGTGTTT 4860
GGTATAACCT AGAATCAATA CATTCTCGTT AGATTTAAAC TTAGCTGTTA AAGAACGCTT 4920
TAGCTCTTTG CTCTTACCAC AAATCATAAC TACTTGTGCA TTTGCaCTTT tCGCTAATAT 4980
ATCAGTAATC ATCGTGTCAA AACCTTTAGA TACACCAAAT GCACCAGCTG aCATTAAAAT 5040
AGTTTGCTTA TCTGGATCTA AGTTGTTGTC TATTAACCAC TGCTTTTGAT TAATAGGCGT 5100
TTCAAATTTG TTATCAATAG GAATACCTGT CaCTTTAACT GTTGAAGGAT CAATACCTAC 5160
GTCTATGAAG TCTTGTTTCG TTTCTTTTGT TGCCACATAA TATCTTGTTG AATACGGCGT 5220
AATCCAGTTT TTATGTAAGC GATAGTCTGT CATCACTGTA GCAACTGGAA TATTAATGTT 5280
AAATTGCTCA GTTAGTACCG ACATAACTGG TGTAGGAAAC GTTAATAATA TTAAATCTGG 5340
CTTTTCTTTT ATCAATAAAT TAATTAACTT ATTAAGTCCA TAGTATTTGT AAAAACATTT 5400
GTCTAGTTTA TCTGGGCGGC TGTAATAAAA CCCTTTGTAC ATATTTCTAA AATATTTAAA 5460
GCTATTGATA TACCATTTTT TACAAATAGA AGTCAAAATT GGATGAGCTT CCATAAATAA 5520
ATCGTGCTCA ATGACGCTTA AATGGTCTAG ATTCATATCA TTAAGTTGAT TAACGATACT 5580
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TTGAGTAACC ATTAATAGCC ACCCTCCGTT AGTTTGAAAA TTTTATTTAA GTGTAACTTA 570Q 

TTTTACGGCA TTATAAAAGA AATAAAGACG CAAAGTCGTT ACATTTATAG CAATTTTAAT 5760 

CTATAGATGA ATTGATACAA AATAAAACGT TATTTTATAA AGCAATTTAT TGTTCTATGT 5820

TTTATTTGTA TATTTAAAAT TATCCAGTAT ACAATTATAG CATATTTTTG GAAACAATTA 5880

TGATATTATA CCATGTTACA AGATGGTTTT AATAATTTAA GATGAGCCAT AATTGTAAAA 5940 

CTAATTCATA ATACCGTATG TTTTATTTTT AATAGTAGAA ATTAGAAAAT GCTGATTAGT 6000 

AGGATATAAC AGTGAAATTA TAAATTTATT AACATCAACA AAACGTGTAT AATAAACATA 6060 

TTGTAGAAAA AGGAGCGGTT CAGTTTGGAT GCAAGTACGT TGTTTAAGAA AGTAAAAGTA 6120

AAGCGTGTAT TGGGTTCTTT AGAACAACAA ATAGATGATA TCACTACTGA TTCACGTACA 6180 

GCGAGAGAAG GTAGCATTTT TGTCGCTTCA GTTGGATATA CTGTAGACAG TCATAAGTTC 6240 

TGTCAAAATG TAGCTGATCA AGGGTGTAAG TTGGTAGTGG TCAATAAAGA ACAATCATTA 6300 

CCAGCTAACG TAACACAAGT GGTTGTGCCG GACACATTAA GAGTAGCTAG TATTCTAGCA 6360

CACACATTAT ATGATTATCC GAGTCATCAG TTAGTGACAT TTGGTGTAaC GGGTACAAAT 6420

GGTAAAACTT CTATTGCGAC GATGATTCAT TTAATTCAAA GAAAGTTACA AAAAAATAGT 6480

GCATATTTAG GAACTAATGG TTTCCAAATT AATGAAACAA AGACAAAAGG TGCAAATACG 6540

ACACCAGAAA CAGTTTCTTT AACTAAGAAA ATTAAAGAAG CAGTTGATGC AGGCGCTGAA 6600

TCTATGACAT TAGAAGTATC AAGCCATGGC TTAGTATTAG GACGACTGCG AGGCGTTGAA 6660

TTTGACGTTG CAATATTTTC AAATTTAACA CAAGACCATT TAGATTTTCA TGGCACAATG 6720

GAAGCATACG GACACGCGAA GTCTTTATTG TTTAGTCAAT TAGGTGAAGA TTTGTCGAAA 6780

GAAAAGTATG TCGTGTTAAA CAATGACGAT TCATTTTCTG AGTATTTAAG AACAGTGACG 6840

CCTTATGAAG TATTTAGTTA TGGAATTGAT GAGGAAGCCC AATTTATGGC TAAAAATATT 6900

CAAGAATCTT TACAAGGTGT CAGCTTTGAT TTTGTAACGC CTTTTGGAAC TTACCCAGTA 6960

AAATCGCCTT ATGTTGGTAA GTTTAATATT TCTAATATTA TGGCGGCAAT GATTGCGGTG 7020

TGGAGTAAAG GTACATCTTT AGAAACGATT ATTAAAGCTG TTGAAAATTT AGAACCTGTT 7080

GAAGGGCGAT TAGAAGTTTT AGATCCTTCG TTACCTATTG ATTTAATTAT CGATTATGCA 7140

CATACAGCTG ATGGTATGAA CAAATTAATC GATGCAGTAC AGCCTTTTGT AAAGCAAAAG 7200

TTGATATTTT TAGTTGGTAT GGCAGGCGAA CGTGATTTAA CTAAAACGCC TGAAATGGGG 7260

CGAGTTGCCT GTCGTGCAGA TTATGTCATT TTCACACCGG ATAATCCGGC AAATGATGAC 7320

CCGAAAATGT TAACGGCAGA ATTAGCCAAA GGTGCAACAC ATCAAAACTA TATTGAATTT 7380
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GTTTTAGCAT CAAAAGGAAG AGAACCATAT 
CATCGAGATG ATTTAATTGG CCTTGAAGCA
GATTAATAAA AGATTTATTG ATGAAGGTAA
AAATAACCAG ATAATCATTG CTATACCAGA
ATTAGATGAA GAAACTTGTT TTGAAGCAAT
GGAAGAGGCA GAATCGATTG CATCACAACT
GAAAGACTAA TGAACTTAAA GCAAGAAGTT
CATCCCGATG CAGGGAAAAC AACGTTAACT
CGTGAAGCGG GTACAGTTAA AGGGAAGAAG
AAAGTTGAAC AAGAGCGTGG TATTTCTGTA
GATTATAAAA TCAATATCTT AGATACACCA
AGAACATTAA TGGCAGTTGA CAGTGCTGTC
CCACAAACAT TGAAGTTATT TAAAGTTTGT
ATTAATAAAT TAGACCGAGT AGGTAAAGAA
ACATTAAATA TTGAAACATA CCCTATGAAT
GGCATCATTG ATAGAAAGTC TAAAACAATT
CATTTGAATG ATGATTTTGA GTTGGAAGAA
GAACAAGCGA TTGAAGAATT AATGTTGGTT
GCGCTGTTGA GTGGAGACTT AACACCTGTA
GTACAAAATT TCTTAAATGC ATATGTTGAT
AAAGlAGACG TTGAAGTAAG CCCGTTTGAT
CAAGCCAACA TGGACCCTAA ACACCGTGAT
GCATTTGAAC GTGGTATGGA TGTTACTTTG
CGTTCAACGT CATTTATGGC AGACGATAAA
ATCATTGGAC TATATGATAC TGGTAATTAT
CAAACCTACA GTTTCCAAGA TTTACCACAA
GCTAAAAACG TCATGAAACA GAAGCATTTC
GGTGCGATTC AATACTATAA AACATTACAC
CAGTTACAAT TTGAAGTTTT CGAACATAGA
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CAAATCATGC 


CAGGGCATAT TAAGGTGCCA 7500
GCTTACAAAA AGTTCGGTGG TGGCCCTGTT 7560
AACTATTGAT GTTTATTTAT TCGAAGCATT 7620
TTGGTTTTGG TCATATCAGA TGGCAATGAC 7680
ACTCATGCAA TTGTTTGTTT TTAAAGAAGA 7740
AACAGATTGG ATAGAAACAT ATAAAAAGGA 7800
GAGTCTAGAA AGACTTTTGC GATTATTTCA 7860
GAAAAACTAT TGTACTTCAG TGGTGCTATT 7920
ACTGGTAAAT TTGCGACAAG TGACTGGATG 7980
ACTAGTTCAG TAATGCAATT TGATTACGAT 8040
GGACATGAAG ACTTTTCAGA AGATACGTAT 8100
ATGGTCATAG ACTGTGCAAA AGGTATTGAA 8160
AAAATGCGTG GTATTCCAAT CTTTACATTC 8220
CCATTTGAAT TATTAGATGA AATCGAAGAG 8280
TGGCCAATTG GTATGGGACA AAGTTTCTTT 8340
GAACCATTTA GAGATGAAGA AAATATATTA 8400
GATCATGCAA TTACAAATGA TAGTGATTTT 8460
GAAGAAGCGG GTGAAGCCTT TGATAATGAC 8520
TTTTTCGGTT CAGCTTTAGC TAACTTTGGT 8580
TTTGCGCCAA TGCCAAATGC GAGACAAACA 8640
GATTCATTTT CAGGATTTAT CTTTAAAATT 8700
AGAATTGCCT TTATGCGTGT CGTTAGTGGT 8760
CAACGTACTA ATAAAAAGCA AAAGATCACA 8820
GAAACTGTGA ATCATGCTGT AGCAGGCGAT 8880
CAAATTGGAG ATACTTTAGT TGGTGGAAAA 8940
TTTACGCCAG AAATTTTTAT GAAAGTTTCT 9000
CATAAAGGTA TTGAACAATT AGTACAAGAA 9060
ACAAACCAAA TTATTTTAGG TGCTGTTGGT 9120
ATGAAAAACG AATATAATGT TGATGTTGTT 9180
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AAGATGAACA CATCAAGATC GATTTTAGTG AAAGATAGAT ATGACGATTT AGTATTCTTA 
TTTGAAAATG AATTTGCAAC AAGATGGTTT GAAGAGAAAT TCCCTGAAAT TAAATTGTAT 
AGTTTACTTT AACAGCTCAA TTGTATAATC GAATTTGTTA CATTAAAAAT AATTGTTTCG 
TTGAAGAAAA ATAAATTGTA TATTTTAAAA GAAAAAGGTA TACTATGATG TATCAAATGA 
ATAACCTATG GCATTTTGTC AGAGGGGAGT AACTTAAGAA TCATGACCGT ATAAATGaTT 
CGACACTTTA TCGTCATTAC GArGATATCT TCCGGTAAAG TGGGCAATTT AAATTGCTTA 
GTGAGACCTT TGCTATTTAT TTAGCATAGG TCTTTTTGTT TGTACTTAAC TTATTTATTT 
AAAGGAGTTG TACATGTTAA TGGATCCAAG TTTGATCTTA CCTTATTTAT GGGTACTTGT 
CGTTTTAGTA TTTTTAGAAG GCTTATTAGC AGCAGATAAC GCGATTGTTA TGGCTGTAAT 
GGTTAAGCAC TTACCACCCG AACAACGTAA AAAAGCTTTG TTTTACGGTT TGTTAGGTGC 
ATTTGTATTT AGATTTTTAG CATTATTCTT AATTAGTATT ATCGCGAACT TTTGGTTTAT 
TCAAGCTGCA GGAGCGGTTT ACTTAATTTA TATGTCAATC AAAAATCTGT GGCAGTTCTT 
TAAACACCCA GAAATTGAAA GTCCTGAAGC TGGAGATGAT CATCATTATG ATGAATCTGG 
TGAAGAGATT AAAGCAAGTA ACAAATCATT CTGGGGAACT GTGTTGAAAA TAGAATTTGC 
AGATATCGCA TTTGCCATTG ATTCTATGCT TGCTGCTTTA gCTATTGCTG TAACACTTCC 
TAAAGTTGGT ATTCACTTTG GTGGTATGGA CTTAGGTCAG TTCGTAGTCA TGTTCCTAGG 
TGGAATGATT GGTGTTATTC TAATGCGTTA TGCAGCAACA TGGTTTGTAG AGCTATTAAA 
CAAATATCCA GGACTTGAAG GTGCAGCCTt CGCGATCGTT GGTTGGGTAG GTGTTAAATT 
AGTTGTCATG GTATTAGCGC ACCCAGACAT CGCTGTATTG CCTGAGCACT TCCCACATGG 
CGTATTATGG CAATCTATTT TCTGGACAGT ACTAATTGGA TTAGTAATTA TCGGTTGGTT 
AGGT$ CAGTT GTTAAAAATA AAAAATCGCA TAAATAATTG ATGTGAAGCG GACAATCTTA 
ATTTAGTTTA AGGTTGTCCT TTTTCATTTA ATTGAGTGAT TTATGAAAAA TGGATTTTGA
AGAATGTGAA TCAAAAGATG CGATATAGTA TTAAGAAAAT GTGCCTTTTA TATTTAGCAT 
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TTTTTCAATA GAAATTATAT AGATTTTAAA GCAAATTAGG TGTTAATGTG TCATAATGAT 
AAGTGATTTT ATTGAATGGA GTGGACATTA GTGGATATTG GTAAAAAACA TGTAATTCCT 
AAAAGTCAGT nACCsaCGTA AGCGTCGTGA ATTCTTCCAC AACGAAGACA GAGAAGAAAA 
TTTAAATCAA CATCAAGATA AACAAAATAT AGATAATACA ACATCAAAAA AAGCAGATAA 
GCAAATACAT AAAGATTCAA TTGATAAGCA CGAACGTTTT AAAAATAGTT TATCATCGCA 
TTTAGAACAG AGAAACCGTG ATGTTAATGA GAATAAAGCT GAAGAAAGTA AAAGTAATCA 
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AAATTCATTA GATTCAGTGG ACCAAGATAC AGAGAAATCA AAATATTATG AGCAAAATTC 
TGAAGCGACT TTATCAACTA AATCAACCGA TAAAGTAGAA TCAACTGAAA TGAGAAAGCT 
AAGTTCAGAT AAAAACAAAG TTGGTCATGA AGAGCAACAT GTACTTTCTA AACCTTCAGA 
ACATGATAAA GAGACTAGAA TTGATTCTGA GTCTTCAAGA ACTGATTCAG ACAGCTCGAT 
GCAGACAGAG AAAATAAAAA AAGACAGTTC AGATGGAAAT AAAAGTAGTA ATCTGAAATC 
TGAAGTAATA TCAGACAAAT CAAATACAGT ACCAAAATTG TCGGAATCTG ATGATGAAGT 
AAATAATCAG AAGCCATTAA CTTTACCGGA AGAACAGAAA TTGAAAAGAC AGCAAAGTCA 
AAATGAGCAA ACAAAAACCT ATACATATGG TGATAGCGAA CAAAATGACA AGTCTAATCA 
TGAAAATGAT TTAAGTCATC ATATACCATC GATAAGTGAT GATAAAGATA ACGTCATGAG 
AGAAAATCAT ATTGTTGACG ATAATCCTGA TAATGATATC AATACACCAT CATTATCAAA
AACAGATGAC GATCGAAAAC TTGATGAAAA AATTCATGTT GAAGATAAAC ATAAACAAAA 
TGCAGACTCG TCTGAAACGG TGGGATATCA AAGTCAGTCA ACTGCATCTC ATCGTAGCAC 
TGAAAAAAGA AATATTTCTA TTAATGACCA TGATAAATTA AACGGTCAAA AAACAAATAC 
AAAGACATCG GCAAATAATA ATCAAAAAAA GGCTACATCA AAATTGAACA AAGGGCGCGC 
TACGAATAAT AATTATAGTG ACATTTTGAA AAAGTTTTGG ATGATGTATT GGCCTAAATT 
AGTTATTCTA ATGGGTATTA TTATTCTAAT TGTTATTTTG AATGCCATTT TTAATAATGT 
GAACAAAAAT GATCGCATGA ATGATAATAA TGATGCAGAT GCTCaAAAAT ATACGACAAC 
GATGAAAAAT GCCAATAACA CAGTTAAATC GGTCGTTACA GTTGAAAATG AAACATCAAA 
AGATTCmTCA TTACCTAAAG ATAAAGCATC TCaAGACGAA GTGGGATCAG GTGTTGTATA
TAAAAAATCT GGAGATACGT TATATATTGT TACGAATGCA CACGTTGTCG GTGATAAAGA

AAATCaAAAA ATAACTTTCT CGAATAATAA AAGTGTTGTT GGGAAAGTGC TTGGTAAAGA

TAAATGGTCA GATTTAGCTG TTGTTAAAGC AACTTCTTCA GACAGTTCAG TGAAAGAGAT 

AGCTATTGGA GATTCAAATA ATTTAGTGTT AGGAGAGCCA ATATTAGTCG TAGGTAATCC 

ACTTGGTGTA GACTTTAAAG GCACTGTGAC AGAAGGTATT ATTTCAGGTC TGAACAGAAA 

TGTTCCTATT GATTTCGATA AAGATAATAA ATATGATATG TTGATGAAAG CTTTCCAAAT 

TGATGCATCA GTAAATCCAG GTAACTCGGG TGGTGCTGTC GTCAATAGAG AAGGAAAATT 

AATAGGTGTA GTTGCAGCTA AAATTAGTAT GCCAAACGTT GAAAnTATGT CATTTGCA 

(2) INFORMATION FOR SEQ ID NO: 128:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6048 base pairs
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(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128: 

TGAAATnGAA TAGTACTATT GCAAGTGTAA AGAGGTTAAT TTTTGCCnCA CGCGGGACTT 60 

AAAAAGGCAA CCACTGGTTG TGACATATCC TTATTTACAT TTATAAATAT AAGGAGGAGG 120 

TAGTAGTGAA AGACTTATTG CAAGCACAGC AAAAGCTTAT ACCGGATCTC ATAGATAAAA 180 

TGTATAAACG TTTTTCTATT CTTACTACTA TCTCAAAAAA TCAGCCTGTC GGACGTCGAA 240 

GTTTAAGCGA ACATATGGAT ATGACTGAAC GTGTACTGCG TTCTGAAACA GATATGCTTA 300

AGAAACAAGA TTTGATAAAA GTTAAGCCTA CCGGAATGGA AATTACAGCT GAAGGTGAGC 360 

AACTGATTTC GCAATTGAAA GGTTACTTTG ATATCTATGC AGATGATAAT CGTCTGTCAG 420 

AAGGTATTAA GAATAAATTT CAAATTAAGG AAGTTCATGT TGTTCCTGGT GATGCTGATA 480

ATAGTCAATC TGTTAAAACA GAATTAGGTA GACAAGCAGG TCAATTACTT GAAGGCATAT 540 

TACAAGAAGA CGCGATAGTT GCTGTAACTG GCGGATCCAC GATGGCATGT GTTAGTGAAG 600 

CAATTCATTT ATTACCATAT AATGTATTCT TCGTACCAGC CAGAGGTGGA CTAGGCGAAA 660

ATGTTGTCTT TCAGGCAAAC ACAATTGCAG CCAGTATGGc aCAACAAGCT GGCGGTTATT 720

ATACGACGAT GTATGTACCT GATAATGTCA GTGAAaCAAC ATATAATACA TTGTTGTTAG 780

AGCCATCAGT CATAAACACT TTAGACAAAA TTAAACAAGC AAACGTTATA TTACACGGCA 840

TTGGTGATGC GCTGAAGATG GCGCATCGAC GTCAATCACC TGAAAAGGTC ATTGAACAAC 900

TTCAACATCA TCAAGCTGTC GGAGAGGCAT TTGGTTATTA TTTTGATACA CAAGGTCAAA 960

TTGTCCATAA GGTTAAAACA ATTGGACTTC AATTAGAAGA CCTTGAATCA AAAGACTTTA 1020

TTTTTGCAGT TGCAGGAGGC AAATCGAAAG GTGAAGCAAT TAAAGCATAC TTGACGATTG 1080 

CACCCAAGAA TACAGTGTTA ATCACTGATG AAGCCGCAGC AAAGATAATA CTTGAATAAG 1140 

AGATAAAAAG TTTAATACTT TTTAAATATC ATTTTAAAGG AGGCCATTAT AATGGCAGTA 1200 

AAAGTAGCAA TTAATGGTTT TGGTAGAATT GGTCGTTTAG CATTCAGAAG AATTCAAGAA 1260 

GTAGAAGGTC TTGAAGTTGT AGCAGTAAAC GACTTAACAG ATGACGACAT GTTAGCGCAT 1320 

TTATTAAAAT ATGACACTAT GCAAGGTCGT TTCACAGGTG AAGTAGAGGT AGTTGATGGT 1380 

GGTTTCCGCG TAAATGGTAA AGAAGTTAAA TCATTCAGTG AACCAGATGC AAGCAAATTA 1440 

CCTTGGAAAG ACTTAAATAT CGATGTAGTA TTAGAATGTA CTGGTTTCTA CACTGATAAA 1500 

GATAAAGCAC AAGCTCATAT TGAAGCAGGC GCTAAAAAAG TATTAATCTC AGCACCAGCT 1560 
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ACAGTTGTTT CAGGTGCTTC ATGTACTACA AACTCATTAG CACCAGTTGC TAAAGTTTTA 1680 

AACGATGACT TTGGTTTAGT TGAAGGTTTA ATGACTACAA TTCACGCTTA CACAGGTGAT 1740 

CAAAATACAC AAGACGCACC TCACAGAAAA GGTGACAAAC GTCGTGCTCG TGCAGCGGCA 1800 

GAAAACATCA TCCCTAACTC AACAGGTGCT GCTAAAGCTA TCGGTAAAGT TATTCCTGAA 1860 

ATCGATGGTA AATTAGATGG TGGTGCACAA CGTGTTCCTG TAGCTACAGG TTCATTAACT 1920 

GAATTAACAG TAGTATTAGA AAAACAAGAC GTAACAGTTG AACAAGTTAA CGAAGCTATG 1980 

AAAAATGCTT CAAACGAATC ATTCGGTtAC ACTGAAGACG AAATCGTTTC TTCAGACGTT 2040 

GTAGGTATGA CTTACGGTTC ATTATTCGAC GCTACACAAA CTCGTGTAAT GTCAGTTGGC 2100

GACCGTCAAT TAGTTAAAGT TGCAGCTTGG TATGATAACG AAATGTCATA TACTGCACAA 2160

TTAGTTCGTA CATTAGCATA CTTAGCTGAA CTTTCTAAAT AATTTTAGTA TAGTTTTTAT 2220

TCAAATACGC TAGTGCTCAG AACTATTTAG CATTAATTAA AGCTTATGAG TAAGCGGGGA 2280

GCACAAACGC TTCTCCGCTT ATTTTTATAT AAAATTTCCT AATTACAAGG AGGAAACACC 2340

ATGGCTAAAA AAATTGTTTC TGATTTAGAT CTTAAAGGTA AAACAGTCCT AGTACGTGCT 2400

GATTTTAACG TACCTTTAAA AGACGGTGAA ATTACTAATG ACAACCGTAT CGTTCAAGCT 2460

TTACCTACAA TTCAATACAT CATCGAACAA GGTGGTAAAA TCGTACTATT TTCACATTTA 2520

GGTAAAGTGA AAGAAGAAAG TGATAAAGCA AAATTAACTT TACGTCCAGT TGCTGAAGAC 2580

TTATCTAAGA AATTAGATAA AGAAGTTGTT TTCGTACCAG AAACACGCGG CGAAAAACTT 2640

GAAGCTGCTA TTAAAGACCT TAAAGAAGGC GACGTATTAT TAGTTGAAAA TACACGTTAT 2700

GAAGATTTAG ACGGTAAAAA AGAATCTAAA AATGATCCAG AATTAGGTAA ATACTGGGCA 2760

TCTTTAGGTG ATGTGTTTGT AAATGATGCT TTTGGTACTG CGCATCGTGA GCATGCATCT 2820

AATGTTGGTA TTTCTACACA TTTAGAAACT GCAGCTGGAT TCTTAATGGA TAAAGAAATT 2880

AAGTTTATTG GCGGCGTAGT TAACGATCCA CATAAACCAG TTGTTGCTAT TTTAGGTGGA 2940

GCAAAAGTAT CTGACAAAAT TAATGTCATC AAAAACTTAG TTAACATAGC TGATAAAATT 3000

ATCATCGGCG GAGGTATGGC TTATACTTTC TTAAAAGCGC AAGGTAAAGA AATTGGTATT 3060

TCATTATTAG AAGAAGATAA AATCGACTTC GCAAAAGATT TATTAGAAAA ACATGGTGAT 3120

AAAATTGTAT TACCAGTAGA CACTAAAGTT GCTAAAGAAT TTTCTAATGA TGCCAAAATC 3180

ACTGTAGTAC CATCTGATTC AATTCCAGCA GACCAAGAAG GTATGGATAT TGGACCAAAC 3240

ACTGTAAAAT TATTTGCAGA TGAATTAGAA GGTGCGCACA CTGTTGTATG GAATGGACCT 3300

ATGGGTGTAT TCGAGTTCAG TAACTTTGCA CAAGGTACAA TTGGTGTATG TAAAGCAATT 3360
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TCTTTAGGTT TTGAAAATGA CTTCACTCAT 
ATTTCAACTG GTGGCGGCGC GTCATTAGAG 3480
TACCTAGAAG GTAAAGAATT GCCTGGTATC AAAGCAATCA ATAATAAATA ATAAAGTGAT 3540
AGTTTAAAGT GATGTGGCAT GTTTGTTTAA CATTGTTACG GGAAAACAGT CACAAGATGA 3600
CATCGTGTTT CATCACTTTT CAAAAATATT TACAAAACAA GGAGTGTCTT TAATGAGAAC 3660
ACCAATTATA GCTGGTAACT GGAAAATGAA CAAAACAGTA CAAGAAGCAA AAGatTCGTC 3720
AATACATTAC CAACACTACC AGATTCAAAA GAAGTAGAAT CAGTAATTTG TGCACCAGCA 3780
ATTCAATTAG ATGCATTAAC TACTGCAGTT AAAGAAGGAA AAGCACAAGG TTTAGAAATC 3840
GGTGCTCAAA ATACGTATTT CGAAGATAAT GGTGCGTTCA CAGGTGAAAC GTCTCCAGTT 3900
GCATTAGCAG ATTTAGGCGT TAAATACGTT GTTATCGGTC ATTCTGAACG TCGTGAATTA 3960
TTCCACGAAA CAGATGAAGA AATTAACAAA AAAGCGCACG CTATTTTCAA ACATGGAATG 4020
ACTCCAATTA TATGTGTTGG TGAAACAGAC GAAGAGCGTG AAAGTGGTAA AGCTAACGAT 4080
GTTGTAGGTG AGCAAGTTAA GAAAGCTGTT GCAGGTTTAT CTGAAGATCA ACTTAAATCA 4140
GTTGTAATTG CTTATGAACC AATCTGGGCA ATCGGAACTG GTAAATCATC AACATCTGAA 4200
GATGCAAATG AAATGTGTGC ATTTGTACGT CAAACTATTG CTGACTTATC AAGCAAAGAA 4260
GTATCAGAAG CAACTCGTAT TCAATATGGT GGTAGTGTTA AACCTAACAA CATTAAAGAA 4320
TACATGGCAC AAACTGATAT TGATGGGGCA TTAGTAGGTG GCGCATCACT TAAAGTTGAA 4380
GATTTCGTAC AATTGTTAGA AGGTGCAAAA TAATCATGGC TAAGAAACCa ACTGCGTTAA 4440
TTATTTTAGA TGGTTTTGCG AACCGCGAAA GCGAACATGG TAATGCGGTA AAATTAGCAA 4500
ACAAGCCTAA TTTTGATCGT TATTACAACA AATATCCAAC GACTCAAATC GAAGCGAGTG 4560
GCTTAGATGT TGGACTACCT GAAGgACAAA TGGGTAACTC AGAAGTTGGT CATATGAATA 4620
TCGGTGCAGG ACGTATCGTT TATCAAAGTT TAACTCGAAT CAATAAATCA ATTGAAGACG 4680
GTGATTTCTT TGAAAATGAT GTTTTAAATA ATGCAATTGC ACACGTGAAT TCACATGATT 4740
CAGCGTTACA CATCTTTGGT TTATTGTCTG ACGGTGGTGT ACACAGTCAT TACAAACATT 4800
TATTTGCTTT GTTAGAACTT GCTAAAAAAC AAGGTGTTGA AAAAGTTTAC GTACACGCAT 4860
TTTTAGATGG CCGTGACGTA GATCAAAAAT CCGCTTTGAA ATACATCGAA GAGACTGAAG 4920
CTAAATTCAA TGAATTAGGC ATTGGTCAAT TTGCATCTGT GTCTGGTCGT TATTATGCAA 4980
TGGATCGTGA CAAACGTTGG GAACGTGAAG AAAAAGCTTA CAATGCTATT CGTAATTTTG 5040
ATGCCCCAAC TTATGCAACT GCCAAAGAAG GTGTAGAAGC AAGCTATAAT GAGGGCTTAA 5100
CTGACGAATT CGTAGTACCA TTCATCGTTG AGAATCAAAA TGACGGTGTT AATGATGGAG 5160
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CGAACAGAGC ATTCGAAGGC TTTAAAGTTG AACAAGTTAA AGACTTATTC TATGCAACAT 
TCACTAAGTA TAATGACAAT ATCGATGCGG CTATCGTCTT CGAAAAAGTT GATTTAAATA 
ATACAATTGG TGAAATTGCA CAAAATAACA ATTTAACTCA ATTACGTATT GCAGAAACTG 
AAAAATACCC TCACGTTACT TACTTTATGA GTGGTGGACG TAACGAGGAA TTTAAAGGTG 
AACGCCGTCG TTTAATTGAT TCACCTAAAG TTGCAACGTA TGACTTGAAA CCAGAAATGA 
GTGCTTATGA AGTTAAAGAT GCATTATTAG AAGAGTTAAA TAAAGGTGAC TTGGACTTAA 
TTATTTTAAA CTTTGCTAAC CCTGATATGG TTGGACATAG TGGTATGCTT GAGCCGACAA 
TCAAAGCAAT CGAAGCGGTT GATGAATGTT TAGGAGAAGT GGTTGATAAG ATTTTAGACA 
TGGACGGTTA TGCAATTATT ACTGCTGACC ATGGTAACTC TGATCAAGTA TTGACGGaTG 
ATGATCAACC AATGACTACG CAwACAACGA ACCCAGTACC AGTGATTGTA ACAAAAGAAG 
GCGTTACACT TAGAGAAACT GGTCGCTTAG GTGACTTAGC ACCTACATTA TTAGATTTAT 
TAAATGTAGA ACAACCTGAA GATATGACAG GTGAaTCTTT AATTAAACAC TAATATTGTA 
AAAGATGTTA AGTAAACGCT TAATGACACT TATTTTTTGA AAATAATAGT AATATCnTTT 
TGTTAAATGA AAGAATAAAG CTATAATAAT TATAGAATAA CTATTTAn 
(2) INFORMATION FOR SEQ ID NO: 129: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5602 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129:
AAAGAAGTGC AAGATATCAT CGCATTAATT AAGTCGTTAC AAAgTGTAAT TGTAGACaTC 
GCTTCCAATA ATGTTGATAC AATTATGCCT GGTTATACTC ATTTACAGCG TGCACAGCCA 
ATTTCATTTG CACATCATAT TATGACTTAT TTTTGGATGT TACAACGAGA CCAACAACGA 
TTTGAAGATA GTTTAAAACG AATCGATATT AATCCTTTAG GTGCAGCAGC CTTAAGTGGT 
ACCACATACC CTATCGATAG ACACGAGACA ACAGCATTGT TGAACTTTGG CAGTCTCTAT 
GAGAATAGCC TAGATGCTGT TAGTGACAGA GACTATATTA TTGAAACATT GCATAATATT 
TCTTTAACGA TGGTTCACTT ATCACGCTTT GCAGAGGAAA TTATTTTCTG GTCCACAGAC 
GAAGCTAAAT TCATTACATT ATCAGATGCA TTTTCAACTG GCTCATCTAT TATGCCACAA 
AAGAAAAATC CTGATATGGC AGAATTAATT AGAGGTAAAG TTGGTCGAAC GACTGGTCAT 
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GAAGATAAAG AAGGTTTATT CGATGCTGTC CATACAATTA AAGGTTCTTT ACGTATTTTC 660 

GAAGGTATGA TTCAAACGAT GACAATTAAT AAAGAACGAC TCAATCAAAC TGTTAAAGAA 720 

GATTTTTCAA ATGCAACGGA ACTAGCAGAT TATTTAGTAA CTAAAAATAT TCCATTTAGA 780 

ACTGCACATG AAATTGTAGG AAAAATCGTC TTAGAATGTA TACAACAAGG TCATTATTTA 840 

TTAGATGTTC CTTTAGCAAC ATATCAACAA CATCATTCTA GTATTGATGC CGATATTTAC 900 

GATTATTTGC AGCCTGAAAA TTGTTTAAAA CGACGTCAAA GTTACGGTTC AACAGGTCAA 960 

TCATCGGTCA AACAACAACT TGATGTTGCT AAACAATTAC TATCACAATA AATACGTTAA 1020 

TCTACCTACC CACAATGTCT ATTAAAATTA CATTGTGGGT ATTTTAATGC TCTCTTCGTC 1080

TTGTTGAACA TCACATTTTT AAGATTCCTA AAATGTTTGA TAATTCTTTT AAATTTATAT 1140

TACAAAAATG TTATAAATTG TAAAAGAAAT GTGTAAAGCG TTTTCACAAG CAGGTTTTTG 1200

TAGTATTTTA AAATTGTTAG ACTACAAATA AAGAGATGAA AGGATAAAGA CTATGACTAA 1260

CTCTTCGAAA AGCTTCACTA AATTTATGGC TGCTTCTGCT GTTTTTACTA TGGGATTTTT 1320 

ATCAGTACCT ACTGCTGGCG CTGAACAAAC AAATCAAATT GCAAATAAAC CTCAGGCTAT 1380 

TCAATGGCAT ACAAATTTAA CGAATGAGCG ATTCACTACT ATCGCACATC GTGGCGCAAG 1440 

TGGCTATGCA CCCGAGCATA CGTTTCAAGC ATATGATAAG AGTCATAATG AGTTAAAAGC 1500 

ATCTTATATC GAAATTGATT TACAACGTAC CAAAGATGGC CATTTAGTTG CTATGCATGA 1560 

TGAAACTGTT AACCGTACAA CAAATGGACA CGGTAAAGTT GAGGATTATA CCCTTGATGA 1620 

ATTAAAACAG TTAGATGCAG GAAGTTGGTT TAATAAAAAA TATCCAAAAT ACGCAAGAGC 1680 

AAGTTATAAA AATGCTAAAG TACCCACTTT AGATGAAATT TTAGAACGTT ATGGCCCGAA 1740

TGCAAACTAT TATATTGAAA CAAAGTCACC TGATGTATAC CCAGGAATGG AAGAACAATT 1800 

ATTAGCTTCA TTGAAAAAGC ATCACCTTTT AAATAACAAT AAATTAAAAA ATGGACATGT 1860 

AATGATTCAA TCATTTTCTG ACGAAAGTTT AAAGAAAATT CATCGTCAAA ATAAGCATGT 1920

GCCATTAGTA AAATTAGTTG ATAAAGGTGA ACTACAACAA TTTAACGACC AACGCTTAAA 1980 

AGAGATACGC TCTTATGCGA TTGGATTAGG TCCTGATTAT ACAGATTTAA CTGAACAAAA 2040 


TACCCATCAT TTAAAAGACT TAGGATTTAT AGTACATCCT TATACAGTGA ATGAAAAAGC 2100 

TGATATGTTA CGATTAAATA AATATGGCGT TGATGGTGTC TTTACAAATT TCGCTGATAA 2160 

ATATAAAGAA GTCATTAAGT AGTAATGTTA AACTAGAAAA CATAAATACA AAAATATAGC 2220 

TATTACTATA AAAAACAGCA GTAAGATATT TCCAAATTGA AATTATCCTA CTGCTGTCTT 2280 

TTTGGGAGTG GGACAGAAAT GATATTTTCG CAAAATTTAT TTCGTCGTCC CACCCCAACT 2340 
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TTGTCTGTAG AAATTGAGGA GCTAATTTCT CTGTGTCGGG GCTCCACCCC AACTTGCACA 24 60 

CTATTGTAAG CTGACTTTCC GCCAGCCTCT GTGTTGGGGC CCCGCCAACT TGCACACTAT 2520 

TGTAAGCTGA CTTTCCACCA GCCTCTGTGT TGGGGCCCCG ACTATTTTTG AAAAGAGCGT 2580 

GTTACACGGG CATTGTTTTA CAGTCAACTA CTGCTAAAAT AAAATTAACG AGCTTAGGGC 2640 

TTTGTTTTCT GTCCCAAGCT CGTTAAATCA CATATGATAA TTAATTATGC CCAACCACGA 2700 

TATCTAGCTG CTTCTGCTGT ACGTTTAATA CCTATGATAT ATGCTGCAAG TCTCATATCT 2760 

ATTTTTCGGT TTTGAGACAA TTCGTAAATC GTATCAAATG CCGCTTCTAA TTTTTCACGT 2820 

AGCTTTTCAT TAACTTCTTC TTCAGACCAA TAATAACCTT GATTATTTTG TACCCATTCG 2880

AAGTAAGAAA CCGTtACACC ACCAGCACTT GCTAATACGT CTGGAACTAA TAATATACCA 2940

CGTTCAGTTA AAATACGTGT TGCTTCTGGT GTTGTAGGTC CATTAGCAGC TTCAACAACG 3000

ATACTAGCTT TAATATCATG TGCATTGTCT TCTGTAATT7 GGTTTGAAAT AGCCGCTGGT 3060

ACTAAAATGT CACAATCTAA TTCAAACAAT TCTTTATTTG AGATTGTTTC TTCAAATAAA 3120

TTTGTTACCG TACCAAAACT ATCACGACGG TCTAATAAAT AATCTATATC TAAGCCATTT 3180

GGATCGTGTA ATGCACCGTA AGCATCAGAG ATACCTACAA TTTTTGCACC TAAATCATAT 3240

AAGAATTTAG CTAAGAAACT TCCGGCATTA CCGAAACCTT GAATAACAAC CTTGGCACCT 3300

TCAATTTGCA TATTACGACG TTTTGCAGCT TGTTCAATTG CAATAACTAC ACCTAGTGCA 3360

GTTGATCTGT CGCGTCCATG AGAACCACCC AATACAATTG GTTTACCTGT GATGAAACCT 3420

GGTGAATTAA ATTTATCTAA TGCACTATAT TCATCCATCA TCCAAGCCAT AATTTGTGAG 3480

TTTGTAAATA CATCTGGTGC TGGAATATCT TTGTTCGGAC CTACGAATTG TGAAATTGCT 3540

CTTACATATC CGCGTGATAA ACGTTCAACT TCATGAATGC TCATTTGACG TGGATCACAA 3600

ACGATACCAC CCTTACCACC ACCGTATGGT AAGTTTACAA TGCCACATTT CAAAGTCATC 3660

CACATTGATA ATGCTTTTAC TTCTTCTTCA TCAACATCTG GGTGGAAACG CACGCCCCCT 3720

TTTGTTGGTC CAACAGCATC ATTATGTTGC GCACGGTAAC CTGTGAATGT TTTTACTGTG 3780 

CCATCATCCA TTCGTACAGG GATACGCACT TGTAACATTC TTAAAGGTTC TTTAATTAAA 3840 

TCGTACATTC CTtCGTCAAA TCCCAATTTA TGCAATGCTT CTTTAATAAT TCCTTGAGTA 3900 

GAAGTTACTA AATTATTGTT CTCAGTCATG ATCCTTTTCG CCTCTTCTTT ACCTAATGAT 3960 

TTCGCTTTCA AACATATTGT AACATAACGT ATTCCTTTTT AAAGCCCTTA CAAACTGATT 4020 

GTTACAACTT TTTGACATTA TTGAAATACA TGTCTTATTT TTTCAAGTGC AAGGTCCAAT 4080 

TCTTCTTTAG TAATAATTAA TGGTGGTGCA AAACGAATGA CAGTATCATG CGTTTCTTTA 4140 
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acacctataa acaaaccacg tccacggact tctttaattg atggatgatc aatttgcttt 
aattgttctt taaaataatc tcctaattct aaagagcggc ctggtaaatc ctcatcaacg 
ataacatcta atgcagcaat tgatgcagca caagcaagtg gattaccacc aaatgttgaa 
ccatgtgagc caggtgtaaa gacatctaat acttctttat ctgctaatac aacagaaatt 
gggaagactc caccacctag tgctttacct aaaatataga catcaggttt tacattatcc 
caatccgtag caaataattt acccgaacga cctaatcctg cttggatttc gtcagcaata 
aataagacat tatgttcatc acataattct ctaattgctt tcaaatatcc ttctggcggt 
atatttatac ccgcttcacc ttgaattggt tctactaaaa ctgctgcagt attttcatta 
attgcagctt tcaatgcatc tacatctcca aaatcaactt ttctaaatcc atctaataac 
ggaccataac cacgttggta ttctgcttct gaagataatg aaactggcgc cattgttcga 
ccatggaagt taccattaaa tgcaatgatt tctgctttat ttggctcaat tcctttaaca 
tcgtatgccc agcgtcgtgc tgctttcaaa gctgtttcta ctgcttcagc acctgtattc 
attggtaaag ctttatcttt acctgccagt ttacaaattt tttcgtacca ttcacctaag 
ttatcactat gaaaagcacg tgaaactaaa gtcactttat cagcttgatc ttttaatgct 
tgaataattt tcggatgtct atgaccttgg ttaacagcgg aatatgcaga taacatatcc 
atatatttat tgccttcagg atctttaacc catacccctt cagcttctga aatgacaatt 
ggcaatggta aataattatg tgctccgtaa tgatttgtta actcaataat tttttcagat 
ttagtcatca tatctcccct tttcatcatt tataactatt atacatgaaa cattatccaa 
ataattacat tagttttcaa agcagatact tttccaccaa aaaagatgaa ataatcacta 
agtttcatta aatttgtcta ttttgaaaac ccttacattt ataatgacat aattacttaa 
atgattacaa gcaaaagaat tgataatttt acacttaatc aaaagtatat tttactaaga 
ATATTTTTAT ttataaatat tgaaaaccac taacaaattg catacacaat atcattagtg 
gtaacagtta aacacttatt tatctttacg gggtaatggg ttaaaaccct tncattaaaa 
ttggatgncc ataaaattag gg 

(2) INFORMATION FOR SEQ ID NO: 130: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5924 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
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TAACCCCATT TTACCTGGAA AAATCgTTTG CGATGCaATm GCaTTtGaAT ATAaATACAT 
TTTACGTATa GAATTATAAA AgGTTTCATT CaAATCTTAG GGTCAAAAAT GTTATAATAT 
TTTTATGTCA AATTTAAAAC AGTAACACTT ATTTACAAGG TTGCAATATT TTGAAGTAAT 
AAAGGAAGTG TCGCGTATTT TAACTTTTTC AGAGCAAAAT GCACTCGCGA AAATAGATGA 
TTTAATGAAT ACTTATTGCA ATCAATGTCC AATCAAAACT CGTCTGCGTA AATTAGAGGG 
GAAAACGAAG GCGCATCATT TTTGTATCAA TGAGTGTTCA ATAGGGAAAG AAATAAAACA 
ATTAGGAAAT GAACTTCAAT AGGAGGAAGT CAAATGAAAA TTATATCTAT ATCAGAAACA 
CCGAACCACA ACACAATGAA GATTACACTT AGTGAAAGCA GAGAAGGTAT GACATCAGAT 
ACGTATACTA AAGTTGATGA TTCACAGCCA GCATTTATTA ATGACATCTT AAAGGTTGAA 
GGCGTTAAAT CAATTTTCCA TGTTATGGAC TTTATTTCAG TAGATAAAGA AAATGACGCA 
AATTGGGAAA CAGTATTGCC AAAAGTAGAG GCTGTATTCG AATAAATTTT TCATCAACTA

GTATTCGGGG GGAATAAAGT ATATGGAAAT TTTACGTATA GAGCCAACAC CAAGTCCAAA 
TACAATGAAA GTTGTTTTGT CATATACAAC AGAAGACAAG TTATCTAATA CTTATAAAAA 
AGTAGAAGAA ACACAACCAA GATTTATAAA TCAGTTGTTA TCTATAGATG GTATCACTTC 
CATTTTTCAT GTCATGAACT TCTTAGCTGT TGATAAGGCA CCAAAAGCTG ATTGGGAAGT 
CATATTACCT GATATTAAAG CTGCTTTTTC TGATGCGAAT AAGGTTTTAG AATCTGTAAA
TGAACCTCAA ATTGACAATC ATTTTGGTGA AATTAAAGCT GAATTATTAA CTTTTAAGGG 
TATACCGTAT CAAATTAAGC TAACTTCTGC TGACCAAGAA TTAAGAGAAC AATTACCACA 
AACATATGTT GACCATATGA CTCAAGCGCA AACAGCACAT GACAATATTG TTTTTATGCG 
TAAATGGCTA GATTTAGGAA ATCGCTATGG AAATATTCAA GAAGTAATGG ATGGTGTCCT 
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TCATGCAACT GATAATTGGA AGACTCGATT ACGAATGTTA AACCATTTTC CAAAGCCGAC 
TTTTGAAGAT ATACCGCTGC TTGATTTAGC TTTATCTGAT GAAAAAGTAC CGGTTAGACG 
TCAAGCGATT GTATTATTAG GTATGATTGA AAGTAAAGAA ATTTTACCGT ATTTATATAA 
GGGGCTTCGT GATAAAAGTC CTGCTGTAAG AAGAACAGCA GGGGATTGCA TAAGCGATTT 



AAAAGCCCAT ATTAATGACA ATGCGTTTGA AGTTAAATTA CAAATTGAAA TGGCCATATC 



ATAaATACAT 60
GTTATAATAT 120
TTGAAGTAAT 180
AAATAGATGA 240
AATTAGAGGG 300
AAATAAAACA 360
ATCAGAAACA 420
GACATCAGAT 480
AAAGGTTGAA 540
AAATGACGCA 600
TCATCAACTA 660
CAAGTCCAAA 720
CTTATAAAAA 780
GTATCACTTC 840
ATTGGGAAGT 900
AATCTGTAAA 960
CTTTTAAGGG 1020
AATTACCACA 1080
TTTTTATGCG 1140
ATGGTGTCCT 1200
AACATGCTTT 1260
TGGATGAATA 1320
CAAAGCCGAC 1380
CGGTTAGACG 1440
ATTTATATAA 1500
TAAGCGATTT 1560
AGAAAATCGT 1620
TTCCCGCACT 1680
TGGCCATATC 1740
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AATTTAATTG GAGGAATTAA ATATGAATGC ATATGATGCT TATATGAAAG 


AAATTGCGCA 1860

ACAAATGCGT GGCGAATTAA CTCAAAATGG TTTTACAAGT TTAGAAACGA GCGAACAGct 1920

ATCGGAGTAT ATGAACCAAG TAAATGCTGA TGACACTACT TTTGTAGTTA TTAACTCTAC 1980

ATGCGGCTGT GCAGCTGGAT TAGCAAGACC AGCTGCAGTA GCAGTTGCAA CACAAAATGA 2040

ACATAGACCT ACAAATACAG TTACAGTTTT TGCTGGGCAA GATAAAGAAG CAACTGCTAC 2100

AATGCGAGAA TTCATTCAGC AAGCACCATC TAGTCCTTCG TATGCTTTAT TCAAAGGTCA 2160

AGATTTAGTT TATrrrATGc CTAGAGAATT TATCGAAGGT AGAGATATTA ATGACATTGC 2220

AATGGACTTA AAGGATGCCT TTGACGAAAA TTGTAAATAG TACACATAAA TAAATATAAA 2280

GGTTAACACA TT1TATAATA TTAAAAATGG TGTCTGTCAT TGAAAATAGA GAATATAGTT 2340

GTATTCTATT TGTTAAATAA AGTCCGTTTT TACCaACTAT ATTTTCTAGA AATTTAACTG 2400

TTTTAATAGG ACATCAAACA TAATATTCaA ATCaTGTGTT AACCTCTTTT TTAAAATTTT 2460

TTAGCATTAA ACTTATAGAT TTGGGTAAAC AATTACCAAT TGGAAACATA TATCACGTTA 2520

CGATGGGGTA GGTACTTAAT CAGCATTTTA TA=LATAAAGT AACGGAATTC ATGATATTAA 2580

TATCATATTC CTAAAATGAG TGATAACAAA ATGCTACATA AAGTTAAGTT ATATCAAACT 2640

AAATATACAT ACTATAAATA ATGAAAATGA GGTGTTATCG CATATGTTGA ATTCATTTGA 2700

TGCAGCATAT CACAGTCTTT GTGAAGAAGT TTTAGAAA7A GGAAATACAC GAAATGATCG 2760

CACAAATACA GGTACGATTT CGAAATTTGG TCATCAACTT CGCTTTGACT TATCTAAAGG 2820

ATTTCCACTA TTAACGACAA AGAAAGTTTC TTTTAAATTA GTAGCAACCG AATTATTATG 2880

GTTCATTAAA GGAGATACAA ACATCCAATA CTTATTAAAA TATAATAATA ATATATGGAA 2940

CGAATGGGCT TTTGAAAATT ATATCAAATC AGACGAGTAT AAAGGTCCAG ATATGACAGA 3000

TTTCDGGCAT CGTGCATTGA GTGATCCTGA ATTTAACGAA CAATATAAAG AACAAATGAA 3060

ACAATTTAAG CAACGTATTC TTGAAGATGA TACATTTGCG AAGCAATTCG GGGATTTAGG 3120

AAATGTTTAT GGTAAACAAT GGCGAGATTG GGTTGATAAA GATGGTAATC ATTTTGATCA 3180

ACTTAAAACA GTAATTGAAC AAATTAAGCA TAATCCAGAT TCAAGGCGAC ACATCGTATC 3240

TGCATGGAAT CCAACAGAAA TTGATACAAT GGCACTTCCG CCTTGTCATA CCATGTTCCA 3300

GTTTTATGTC CAAGATGGTA AGTTAAGTTG CCAGTTATAC CAACGTAGCG CAGATATCTT 3360

TTTAGGTGTG CCATTTAATA TCcGCagctA CGCTTTATTG ACACACCTTA TTGCCAAAGA 3420

ATGTGGACTT GAAGTGGGTG AATTTGTGCA TACATTTGGA GATGCACATA TTTATTCAAA 3480

TCATATTGAT GCGATTCAAA CACAATTAGC ACGTGAAAGC TTCAATCCTC CAACATTAAA 3540
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TGAATCACAT 


CCAGCAATAA AAGCTCCAAT AGCAGTGTAG TCATTGCATA GTTAGCTAAC 3660

CATATAGACA TCAAAATGAC ATCATAGTAT TTTCAAGTGC AAAAAAGTAC TTTTTTGTGT 3720

TAAACGTTTT CATAAATTAT GCAAAATCAT TATTTCTATC ACACTTTATG ATAAAAATTG 3780

TGTTAAATTA AAGATAACTT AGTAATAAAA AATGAAATGA TAGAAGAAGG AGGATAATTA 3840

TGACTTTATC CATTCTAGTC GCACATGACT TGCAACGAGT AATTGGTTTt GAAAATCAAT 3900

TACCTTGGcA CCTACCAAAT GATTTGAAGC ATGTTAAAAA ATTATCAACA GGTCATACTT 3960

TAGTAATGGG TCGTAAGACA TTTGAATCGA TTGGTAAACC ACTACCGAAT CGTCGAAATG 4020

TTGTACTTAC TTCAGATACA AGTTTCAACG TAGAnGGCGT TGATGTAATT CACTCTATTG 4080

AAGATATTTA CCAACTACCG GGCCATGTTT TCATATTTGG AGGGCAAACA TTATTTGAAG 4140

AAATGATTGA TAAAGTGGAC GACATGTATA TTACTGTTAT TGAAGGTAAA TTCCGTGGTG 4200

ATACGTTCTT TCCACCTTAT mCATTkGAgr CTGGGAAGTT GCCTCTTCAG TTGAAGGTAA 4260

ACTAGATGAG AAAAATACAA TTCCACATAC CTTTCTACAT TTAATTCGTA AAAAATAAGG 4320

GCGAAAACGA CCATGACAAA ACAGATTATA GTAACAGACT CAACATCCGA TTTATCTAAA 4380

GAATACTTAG AAGCAAACAA CATTCATGTA ATTCCTTTAA GTTTAACTAT TGAAGGAGCT 4440

TCATACGTTG ACCAAGTAGA TATTACATCA GAAGAATTTA TTAATCATAT TGAAAATGAT 4500

GAAGATGTAA AGACAAGTCA GCCAGCCATA GGTGAATTTA 7ATCTGCTTA TGAAGAACTA 4560

GGAAAAGATG GCTCTGAAAT CATAAGTATT CATCTTTCTT CAGGATTAAG TGGTACATAT 4620

AACACTGCTT ACCAAGCAAG TCAAATGGTA GATGCTAATG TAACTGTTAT TGA7TCAAAA 4680

TCTATTTCTT TTGGTTTAGG GTATCAAATA CAACACCTAG TAGAGCTTGT AAAAgAaGGT 4740

GtCTCAACTT CTGAAATAGT TAAAAAGTTA AATCATTTAA GAGAAAACAT TAAATTATTT 4800

GTAGXTATAG GGCAATTGAA TCAATTAATT AAAGGTGGCA GAATTAGTAA AACAAAAGGT 4860

TTGATTGGTA ATCTTATGAA AATTAAACCA ATTGGTACAC TAGATGATGG TCGCTTAGAG 4920

CTTGTGCmCA ATGCGAGAAC TCaAAATTCk AGTATCCAAT ACTTGAAAAA GGAAATTGCT 4980

GAATTTATAG GAGATCATGA AATCAAATCC ATTGGTGTCG CACATGCTAA CGTCATTGAA 5040

TATGTTGATA AATTGAAGAA AGTTTTTAAT GAAGCTTTTC ATGTGAATAA TTACGATATA 5100

AATGTAACTA CACCAGTTAT TTCTGCACAT ACTGGTCAAG GTGCGATTGG CCTCGTAGTC 5160

CTTAAGAAGT AAATTTAATC TTTTCAGTGT TAATTACTTC CATTTCAATC CTTTATAGAC 5220

TAAATTTATA ATTAGATAGA TAGAGGAGGT AATTCATATG ACAAAAGAAT ATGCAACATT 5280

AGCAGGAGGA TGTTTCTGGT GCATGGTTAA ACCATTTACA TCATATCCAG GCATCAAGTC 5340
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GAATCAAACC GGCCATGTCG AAGCAGTACA AATTACGTTT GATCCAGAGG TTACTTCCTT 
TGAAAATATA TTAGACATAT ATTTCAAAAC ATTTGACCCA ACTGATGATC AAGGGCAATT 
TTTCGATAGA GGCGAAAGCT ATCAACCAGT CATTTTCTAT CATGATGAAC ATCAGAAAAA 
GGCTGCTGAG TTTAAAAAGC AACAATTAAA TGAACAAGGT ATTTTCAAGA AACCAGTGAT 
TACACCTATT AAACCATATA AAAATTTCTA TCCAGCTGAA GACTACCATC AAGATTATTA 
CAAAAAGAAC CCGGTACATT ATTACCAATA TCAACGTGGT TCAGGTAGAA AAGCGTTTAT 
AGAATCACAT TGGGGGAATC AAAATGCTTA AAAAAGATAA AAGTGAACTA ACAGATATAG 
AATATATTGT TACACAAGAn AACGGCACTG AACCACCATT TATGAATGAA TATTGGAATC 
ATTTTGCTAA AGGATTTATG TAGATAAAnT TCnGGTAAAC CTTG 
(2) INFORMATION FOR SEQ ID NO: 131: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9280 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131: 



GGCCGTTnAA 


AATCTCCAAA 


ATAnAAAAAC 


CCA7CTTGTT 


CCAATGTTTT 


AAAATCGCCa 


TCCaACACTT 


GaTCaATAGC 


TTGCAACAAC 


GTTGAACGTG 


TTTTaCCAAA 


AGCATCaAAC 


GCTCCCACTA 


AAATCAGTGC 


TTCAAGTAAC 


TTTCTCGTTT 


TGACTCTCTT 


CGGTATACGT 


CTAGCAAAAT 


CAAAGAAATC 


TTTAAATTTG 


CCGTTCTGAT 


AACGTTCATC 


AACAATCACT 


TTCACACTTT 


GATAACCAAC 


ACCTTTAATT 


GTACCAATTG 


ATAAATAAAT 


GCCTTCTTGG 


GAAGGTTTAT 



AAAACCAATG 


ACTTTCGTTA 


ATGTTCGGTG 


G CAAT/iT AGT 


GATACCTTGT 



TTTTTTGCTT 


CTTCTATCAT 


TTGAGCAGTT 


TTCTTCTCAC 


TTCCAATAAC 


ATTACTTAAA 


ATATTTGCGT 


AAAAATAATT 


TGGATAATGG 


ACTTTTAAAA 


AGCTCATAAT 


GTATGCAATT 


TTAGAATAGC 


TGACAGCATG 


TGCTCTAGGA 


AAACCATAAT 


CAGCAAATTT 


CAGAATCAAA 


TCAAATATTT 


GCTTACTAAT 


GTCTTCGTGA 


TAACCATTTT 


GCTTTGCACC 


TTCTATAAAA 


TGTTGACGCT 


CACTTTCAAG 


AACAGCTCTA 


rrrrrriTAC 


TCATTGCTCT 


TCTTAAAATA 


TCCGCTTCAC 


CATAACTGAA 


GTTTGCAAAT 


GTGCTCGCTA 


TTTGCATAAT 


TTGCTCTTGA 


TAAATAATAA 


CACCGTAAGT 


ATTTTTTAAT 


ATAGGTTCTA 


AATGCGGATG 


TAAATATTGA 


ACTTTGCTTG 


GATCATGTCT 


TCTTGTAATG 


TAAGTTGGAA 


TTTCTTCCAT 


TGGACCTGGT 
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ACACTTCTTA 


CACCGTCAGA CTCTAATTGG AATATGCCAG TCGTATCTCC TTGCGACAAC 960

AATTCAAACA CTTTTTGATC ATCAAACGGA ATCTTTTCGA TATCAATATT AATACCTAAA 1020

TCTTTTTTGA CTTGTGTTAA GAIT'IXjATGA ATAATCGATA AGTTTCTCAA ccctagaaaa 1080

TCTATTTTTA ATAACCCAAT ACGTTCGGCT TCAGTCATTG TCCATTGCGT TAATAATCCT 1140

GTATCCCCTT TCGTTAAAGG GGCATATTCA TATAATGGAT GGTCATTAAT AATAATTCCT 1200

GCCGCATGTG TAGATGTATG TCTTGGTAAA CCTTCTAACT TTTTACAAAT ACTGAACCAG 1260

CGTTCATGTC GATGGTTTCG ATGTACAAAC TCTTTAAAAT CGTCAATTTG ATATGCTTCA 1320

TCAAGTGTAA TTCCTAATTT ATGTGGGATT AAACTTGAAA TTTCATTTAA TGTAACTTCA 1380

TCAAACCCCA TAATTCTTCC AACATCTCTA GCAACTGCTC TTGCAAGCAG ATGACCGAAA 1440

GTCACAATTC CAGATACATG TAGCTCGCCA TAvrrrrcrr GGACGTACTG AATGACCCTT 1500

TCTCGGCGTG TATCTTCAAA GTCAATATCA ATATCAGGCA TTGTTACACG TTCTGGGTTT 1560

AAAAAACGTT CAAATAATAG ATTGAATTTA A7AGGATCAA TCGTTGTAAT TCCCAATAAA 1620

TAACTGACCA GTGAGCCAGC TGAAGAACCA CGACCAGGAC CTACCATCAC ATCATTCGTT 1680

TTCGCATAAT GGATTAAATC ACTTACTATT AAGAAATAAT CTTCAAAACC CATATTAGTA 1740

ATAACTTTAT ACTCATATTT CAATCGCTCT AAATAGACGT CATAATTAAG TTCTAATTIT 1800

TTCAATTGTG TAACTAAGAC ACGCCACAAA TATTTTTTAG CTGATTCATC ATTAGGTGTC 1860

TCATATTGAG GAAGTAGAGA TTGATGATAT TTTAATTCTG CATCACACTT TTGAGCTATA 1920

ACATCAACCT GCGTTAAATA TTCTTGGTTA ATATCTAATT GATTAATTTC CTTTTCAGTT 1980

AAAAAATGTG CACCAAAATC TTCTTGATCA TGAATTAAGT CTAATTTTGT ATTGTCTCTA 2040

ATAGCTGCTA ATGCAGAAAT CGTATCGGCA TCTTGACGTG TTTGGTAACA AACATtTTGA 2100

ATCCAAACAT GTTTTCTACC TTGAATCGAA ATACTAAGGT GGTCCATATA TGTGTCATTA 2160

TGGGTTTCAA ACACTTGTAC AATATCACGA TGTTGATCAC CGACTTTTTT AAAAATGATA 2220

ATCATATTGT TAGAAAATCG TTTTAATAAT TCAAACGACA CATGTTCTAA TGCATTCATT 2280

TTTATTTCCG ATGATAGTTG ATACAAATCT TTTAATCCAT CATTA'l'lTlT AGCTAGAACA 2340

ACTGTTTCGA CTGTATTTAA TCCATTTGTC ACATATATTG TCATACCAAA AATCGGTTTA 2400

ATGTTATTTG CTATACATGC ATCATAAAAT TTAGGAAAAC CATACAATAC ATTGGTGTCA 2460

GTTATGGCAA GTGCATCAAC ATTTTCAGAC ACAGCAAGTC TTACgGCATC TTCTATTTTT 2520

AAGCTTGAAT TTAACAAATC ATAAGCCGTA TGAATATTTA AATATGCCAC CATGATTGAA 2580

TGGCCCCTTT CTATTAGTTA AGrmxsixsc GTAAAGCTGT AGCAAGTTGC TCAAATTCAT 2640
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CAATATCATT AATAATCAAT TGCCCTTTAG AACGTAATCG ACATCTGATT TCATTACCTT 2760 

CATCGACTGC AAATACCCAT ATTTTCAAGC CTTTGATGTC AGCAATTGTA TTAACAAACT 2820

GAGATGCTTC ATTTGGCTGA ATACCGAATT GCTCCAATAC ATCTTCAGTT ATTTTAACTT 2880

GGCAGAATCC ATCATCCATA AGTTCGAAAT GTTGTAAAAC ATAACCTTGA AACGGCAACA 2940

TTTTTGGGTC CTTCTCCATC ATTTTATTTA AAAGCGCATT ATGATCAATA TCATGCCCAA 3000

TTAACTTTCC AGCAATTTCC ATAGTATGTT CTGAGGTATT GTTAAAAAGG AATCGCCCAG 3060

TATCACCGAC GATACCAAGA TATAAAACGC TCGCGATATC TTTATTAACA ATTGCTTCAT 3120

CATTAAAATG TGAGATTAAA TCGTAAATGA TTTCACTTGT AGATGACGCG TTCGTATTAA 3180

CTAAATTAAT ATCACCATAC TGATCAACTG CAGGATGATG ATCTATTTTA ATAAGTTTAC 3240

GACCTGTACT ATAACGTTCA TCGTCAATTC GTGGAGCATT GGCAGTATCA CATACAATTA 3300

CAAGCGCATC TTGATATGTT TTATCATCAA TGTTATCTAA CTCTCCAATA AAACTTAATG 3360

ATGATTCCGC TTCACCCACT GCAAATACTT GCTTTTGCGG AAATTTCTGC TGAATATAGT 3420

AT7TTAAACC AAGTTGTGAA CCATATGCAT CAGGATCTGG TCTAACATGT CTGTGTATAA 3480

TAATTGTATC GTTGTCTTCG ATACATTTCA TAATTTCATT CAAAGTACTA ATCATTTTCA 3540

TACTCCCTTT TTTAGAAAAG TTGCTTAATT TAAGCATTAG TCTATATCAA AATATCTAAA 3600

TTATAAAAAT TGTTACTACC ATATTAAACT ATTTGCCCGT TTTAATTATT TAGATATATA 3660

TATTTTCATA CTATTTAGTT CAGGGGCCCC AACACAGAGA AATTGGACCC CTAATTTCTA 3720

CAAACAATGC aAGTTGGGGT GGGGCCCCAA CGTTTGTGCG AAATCTATCT TATGCCTATT 3780

TTCTCTGCTA AGTTCCTATA CTTCGTCAAA CATTTGGCAT ATCACGAGAG CGCTCGCTAC 3840

TTTGTCGTTT TGACTATGCA TGTTCACTTC TATTTTGGCG AAGTTTCTTC CGACGTCTAG 3900

TATGCCAAAG CGCACTGTTA TATGTGATTC AATAGGTACT GTTTTAATAT ACACGATATT 3960

TAAGTTCTCT ATCATGACAT TACCTTTTTT AAATTTACGC ATTTCATATT GTATTGTTTC 4020

TTCTATAATA CTTACAAATG CCGCTTTACT TACTGTTCCG TAATGATTGA TTAAAAGTGG 4080

TGAAACTTCT ACTGTAATTC CATCTTGATT CATTGTTATA TATTTGGCGA TTTGATCGTT 4140

AATTGTTTCA CCCATCTGAG GCTGTCTTCC TAAAAGTTGC ATAGACTTTA AAACATCTTG 4200

TCTATTAATC ACACCCACTG TCTTTTTATT ACTCGAAACG ACAGGAATCA ATTCAATACC 4260

TTCCCAAATC ATCATATGCG CACAACTTGC TACTGTACTC ATAGCATTTA CATAAATAGG 4320

ATTTCGCGTC ATCACTTTAT CTATTTCGTC GTCGTCCTTT GTATTAATCA TCTCTCGACT 4380

TGTTACAATA CCTACTAATT TATACGACTC ATTGACTACC GGAAATCTTG TATGGCCAGT 4440
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ATCTAATGGC 


GTCATTATAT CTTGAACTAT TAAGATATCT TTTCGTATTT TCTGATTAAA 4560

AAGTGCTTTG TTGATAATAT TTGCAACTAG GAATGTATCA TAACTTGATG ATAGAACAGG 4620

TAAATCATGT TCATTCGCAA AATTAATAAC TTTATTAGAT GGCTTAAATC CACCAGTAAT 4680

TAATATAGCC GTACCTCTTT TTAAAGCTTC AATCTGCACA TCTTCACGAT TTCCGACAAT 4740

CAATAATGTC TTTGGACCAA TATACTTTAA AATATCTTTG AGTTCCATTG CTCCAATTGC 4800

AAATTTAGAT ACCATCTTAG TGATACCTTT GTTGCCACCT AACACTTGGC CATCAATAAT 4860

ATTGACAATT TCATTAAAAG TTAAATGTTC AATTTCATTA CGATTACGTT TTTCGATTCG 4920

AACCGTACCA ACACGATCTA TCGTTGCGAC CATGCCCATT TTATCAGCAT CTTTmATTGc 4980

ACGATATGCT GTCCCytCaG ATACGTTTAA AAATTTAGCG ATTTTACGCA CCGAAATTTT 5040

AGAGCCTATA GATAACGATT CAATATAATC TAAAATTTGT TCATGTTTTG TCATTCTTTA 5100

CCTCTTCTTT TCGAACAGTA TTAACTACAT TATAACTTTA TTTTGGATAA AAAGCATTGA 5160

AGTGAAATGA AATAATGATC GTTtCACCTA TTTTATTTTT TGAAAATATA CAACAAACAC 5220

AAAGATCACA AAATCTTTAA TTTTAAATGG AAAAATCCAT TATTATTTAT TAGAATGTAA 5280

GTGAGGAGGG ATGTACTAAT GTATAAAAAT ATATTACTTG GTGTAGACAC TCAGTTAAAA 5340

AATGAAAAAG CACTAAAAGA AGTGTCTAAA TTAGCTGGCG AAGGTACAGT CGTAACAGTT 5400

TTAAACGCAA TCAGCGAACA AGaTGCTCAA GCATCAATTA AAGCAGGTGT TCATT7AAAC 5460

AAACTTACTG AAGAACGAAG CAAGCGATTG GAAAAAACAC GCAAAGCTTT AGAAGATTAT 5520

GGTATTGATT ATGACCAAAT AATTGTTCGT GGTAATGCAA AAGAAGAACT ATTAAAACAT 5580

GCTAATAGCG GTAAATATGA AATTGTTGTT TTAAGTAACC GTAAAGCAGA AGACAAAAAG 5640

AAATTTGTAC TTGGAAGTGT CAGCCACAAA GTAGCAAAAC GTGCGACTAT CCCTGTATTA 5700

ATCCffTAAAT AAAATTTTTA TCCAGAATCA CAAATAATCT TTCAATCATG ATGCAGTCTC 5760

AAACGACTGA GTAAATACAA GAAACGATTA TGACTGTGGT TCTGGATTTT TTATATCGTA 5820

GTAAATTTAT AATCAATGTC TAATTGTATA AAACTAAAAT TACGAGAGTA GGTCAGAAAT 5880

GATAAAGAAC CACTGATGTC CCCCGTCCAC GTCGTAACTG AATCAGTAGA ATATAAAAAC 5940

ACCCACTAAA AATATGCAGA CGATAACTTC CACATAGATT AGCGAGGTGT TTTTTAGTGT 6000

AAAATCTATA TTCTATTTAA AACTGAACAG ATTCACCTGG mTAAAATT TGCACGTCCC 6060

CTACATTAAC AGCATCTTTA AATTGTTGTG GATCTTGTTC GATTAATGGG AATGTATCAT 6120

AATGAATCGG TACAGAAATT TTTGGTTTAA TAAA7TCATT AATAGCATAA CTTGCATCAT 6180

CAATACCCAT CGTAAAATTA TCTCCAATTG GTACAAAACA TACATCAACT GGATGACGTT 6240
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TTCAACTTCA AACACGATAC CCATTGGCAT ACCTAAATAA ACTGGgAATA CCATTTTCAT 6360 

GTGTAAAACT TGAACTATGA AATGCTTGAA CAAATTTAAC GCTTCCGAAA TCAAaGTTTG 6420 

CTTTACCACC AaTATTCATA CCATGAACAT TTTCAACACC GTGATATGAA GAAAGATAGT 6480 

CAGCCATTTC TGCACTTCCA ATTACTGTTG CTCCTGTTTT CTTTGCTAGT TCCACAACAT 6540 

CACCAAAATG ATCAAAATGA CCGTGCGTTA AAACGATATA GTCTACCTGC ACTGTTTCAA 6600 

TATTCAAATC ACACTTAGGG TTATTTGAAA TAAACGGATC TACGATAACC TTTTTGTTGT 6660 

TCCCTTCTAA ATAAATCGTT GATTGACCAT GAAATGATAA CTTCATTTGA GCATCCTCCT 6720 

ATCAATTACT ATATAAATTT AGTACCCTTT TGCCACTTAA TTATAACAAA TTCTCAAATT 6780

TTAAAAATTG AAAATCTAGT TAATGTATTA GCTCGATTTT GAAATCTAAT AATAATTGGC 6840

ATAAAATGGA AGTAATATTA TGTTGAGGAG TGTTTATAAA ATGACAAAAA TATCAAAAAT 6900

AATAGACGAA TTGAACAATC AACAAGCTGA TGCAGCATGG ATTACAACAC CGTTGAATGT 6960

ATATTATTTT ACTGGATACC GTAGCGAACC CCATGAAAGA TTATTTGCAT TATTGATTAA 7020

GAAAGATGGT AAACAAGTAC TATTTTGTCC AAAAATGGAA GTCGAAGAAG TCAAAGCATC 7080

ACCTTTCACA GGTGAAATCG TTGGATATTT AGACACTGAA AACCCTTTTT CACTTTATCC 7140

TCAAACAATC AATAAATTAC TAATTGAAAG CGAGCACTTA ACAGTAGCAC GCCAAAAACA 7200

ATTAATCTCT GGTTTCAATG TCAATTCATT CGGAGATGTT GATTTAACAA TCAAACAAT7 7260

GAGAAATATT AAATCCGAAG ATGAAATTAG CAAAATACGT AAAGCTGCTG AGTTAGCAGA 7320

TAAGTGTATC GAAATAGGTG TTTCTTATTT AAAAGAAGGT GTGACTGAAT GTGAAGTAGT 7380

CAACCATATT GAGCAAACTA TCAAACAATA TGGCGTCAAT GAAATGAGTT TTGATACGAT 7440

GGTTTTATTT GGAGATCATG CCGCATCACC TCATGGCACA CCAGGAGATC GCAGAITAAA 7500

AAGGAATGAA TATGTACTAT TTGATTTAGG TGTAATTTAT GAGCATTATT GTAGCGATAT 7560

GACACGTACT ATTAAATTTG GTGAACCTAG CAAAGAAGCA CAAGAAATTT ATAATATTGT 7620

ATTAGAAGCA GAAACATCTG CAATCCAAGC AATTAAACCT GGAATACCAT TAAAAGATAT 7680

CGATCATATC GCTAGAAATA TTATTTCAGA AAAAGGTTAT GGTGAATATT TCCCTCATCG 7740

CTTAGGTCAT GGCCTAGGAT TACAAGAACA TGAATATCAA GATGTTTCAA GTACTAATTC 7800

TAATTTGTTA GAAGCTGGCA TGGTTATTAC AATCGAACCA GGTATTTATG TACCTGGTGT 7860

TGCAGGTGTA AGAATTGAAG ATGACATACT TGTCACTAAT GAAGGATATG AAGTATTAAC 7920

ACATTACGAA AAATAAGGAG TGGGATAAAA ATGAAAAGCT TGTTACAAGC GCATTCTCAT 7980

TCAGTCAAAC ACTGCCAATA TAACATTGTA GCGCCTAAGA CATAAATTTT TATCCAAGTC 8040
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TGTAATGAAT CAAATCAATA TCATTCATGT TCGATGATTT CTTCGCATTG TTTCTAGCTT 8160 

TAATTTATCA TTATTTAATT TTAATAACCA AGGAGATGAT AACGTCATTC TTTAGTACGC 8220 

TGTAATCCAT TCCCTTTTCA TCAAATTCAA ATTATAATTG TAATGCTTCT TCTACAGATT 8280 

TATATTCCAT TTCAAATGCC TCTGCAACGC CTTTATTGGT TACGTGACCT TTGTAAGTAT 8340 

TTAAACCTAA TGATAATGGT TGATTTGATT TAAATGCTTC TCTATACCCT TTATTAGCTA 8400 

GCATGAGCGC ATAAGGTAGC GTAgCATTAT TTAAAGCTAA CGTCGAAGTA CGCGGTACTG 8460

CACCTGGCAT ATTTGCAACT GCATAATGAA CCACACCATG CTTAATATAT GTAGGATCAT 8520

CATGTGTCGT AATTTTATCA GTTGtTTCAA AAATACCGCC TTGATCAATA GCAATGTCAA 8580

TAATAACTGA CCCATTTTTC ATTTGTTTAA TCATGTCTTC TGTTACAAGT CTTGGCGCTT 8640

TAGCACCTGG AATTAAAACT GCACCTATTA CTAAATCACT TTGTT7AACA TACAAC7CAA 8700

TATTCAACGG ATTTGACATA ATTGTATGTA CACGTCCACC GAATAAATCA TCTAATTGTT 8760

GTAAACGCTT TGGATTAACA TCTAAAATCG TAACATCTGC ACCTAGTCCT AGTGCAATTT 8820

TAGCTGCATT TGTTCCTGCT TGACCACCAC CGATAATAGT TACTTTACCC TTAGGTACTC 8880

CTGGGACACC ACCTAGTAGA ATTCCCATAC CACCATTAAG TTTTTGTAGG AACTCTGCGC 8940

CAACTTGAGC TGACATTCTT CCTGCTACCT CACTCATTGG TGATAACAAT GGTAAAGATC 9000

GGTCTGGTAA CTGCACAGTC TCATATGCAA TACTAATTAC TTTTCTATCT ATCAAAGCTT 9060

GTGTTAATTT TTCTTCATTT GCTAAATGAa gatAaGTGAA TAATACAAGC CCTTCTTTAA 9120

AATATGGATA TTCAGATTCA AGTGGTTCTT TAACTTTAAT AACCATATCC ACATCCCAAA 9180

CTTTTGCTTG TTCAGCAACA ATCTCAGCAC CTGCTTCTTT GTAATCTACA TCTTCAAAGA 9240

ATGATCCTGA ACCCGcATTT GTTTCCACTA AAACAGTATG 9280

(2) INFORMATION FOR SEQ ID NO: 132: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 4669 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132: 
CTGATTAATC TCTTGTTGTC GTGTATTTAC TAATTGAATC GTTGGTGTCT GAACACGTCC 60 

CAGGGATAGC TGTGCATCAT ACTTTGTTGT TAGTGCACGC GTTGCATTAA TCCCAACAAT 120

CCAATCTGCC TCACTTCTCG CTAACGCTGC ATAATACAAA TCGTTATATT GACGACCGTC 180 
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ACGGATTGGC TTTTTGTTAC 


CAACTTTATC CAAAATCAAT CTTGCAACTA GTTCACCTTC 300

TCGTCCaGCA TCTGTTGCAA TAATAATATC TTTCACTTTA TTATCTAAAA TTAACGCTTT 360

TACTGTTTTA AATTGTTTGC TTGTTTTACC AATAACAACA GTTTTCATAT ATTTAGGTAT 420

AATTGGAAGG TCTTCTAATC GCCATTCCTT TAAATTTTTA TCGTATTGTT CAGGTGTCGC 480

ATTTGTCACT AGATGACCTA ACGCCCACGT GACAATATAT TGGTTATTTT CAAAGTAACC 540

ATTACGCTTC TGATTTATTT GTAAAGCATC AGCAATATCT CTTGCGACTG ATGGTTTTTC 600

AGCTAATATT AAAGATTTCA TAAATTATCC TTTCTCATAC GTTCTTTTAT TTCGAACGTG 660

CTTCATCTAT TCCACTAATC TTTGATTTAA ATTCAATGAT TGCAAATGAT GTGTTAAATG 720

TATTGTAACA TGTTAATATC ACTATTAACT TTCATTTCAG TTGAAATACT ATATAATAAA 780

AGTAACAAAA AGTACGGAGG TAATGACATG AGCATAGTTC AGTTATATGA TATTACACAA 840

ATAAAATCGT TCATTGAACA TTCGAATTAT GAATCAGCAT CATACTTATA TAAACTTCCT 900

CAACAGTACA ATGAAATAGA TGTATTAATA ACCGATGCGA 7TGAATCACC TGGTGTATTT 960

TCGATTAAAG AAAACGATTC AATCAAAGCA ATCATATTGT CTTTTGCATA CGATAAAAAT 1020

AAATTCAAAG TCATAGGCCC TTTCGTGGCT GACAATTATG TATTATCTGT CGATACGTTT 1080

GAAACGCTAT TTAAAGCAAT GACTTCGAAC CAACCTGACG ATGCCGTCTT TAACTTTTCT 1140

TTTGAAGAAG GCATTCAACA ATACAAACCA TTAATGAAAG TTATTCAAGC AAGTTATAAC 1200

TTCACTGACT ATTACATAGA AGCCCGTACA AGATTAGAAG AAGATATGCA CCAACCAAAT 1260

ATCATTCCTT ATCACAAAGG GTTTTATCGT GCTTTCAGCA AATTACACAC AACTACATTT 1320

AAATATCAGG CACAGTCACC ACAAGATATC ATTGATAGTT TAGACGACCA TCATCATTTG 1380

TTTITATTTG TTAGCGAAGG riTACTTAAA GGTTATTTAT ACCTTGAAAT TGATTCACAA 1440

CAGTCAATCG CCGAGATTAA ATACTTCAGT TCTCATGTAG ATTACCGTTT GAAAGGTATC 1500

GCTTTCGAG7 TGCTTGCGTA TGCATTGCAA TATGCTTTTG ATAATTTTGA TATTAGAAAA 1560

GTTTATTTTA AAATTCGTAA TAAAAATAAT AAACTCATCG AACGATTTAA TGGTCTAGGT 1620

TTCCATATCA ACTATGAGTA CATTAAATTC AAATTCGAAT CACGTAACGT AAAAGATCAA 1680

ACAATCCCTG AATAAAACAC CAAGCAAATA CCCTACAGTA CATCATTAGC ATGTATTGTG 1740

GGTTTTTCTA CTTTTTGTAA ATATTGAAAA TTATAAGTAG TTGTTTTTTA CTATTAGGGC 1800

AGAATGCTTT ACAATAACAT GCAAGTGTCA ATTAAGGGGA GCACTTGCAT AAATAGTATA 1860

GGAGAGTGAG TAGTCTTGCA ATTTCTTGAT TTCTTAATCG CACTTTTACC TGCTTTATTC 1920

TGGGGAAGTG TCGTTCTTAT TAATGTGTTC GTCGGCGGTG GACCTTACAA CCAAATTCGT 1980
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TTCAATAATC CTACTGTAAT TATTGTCGGT CTTATTTCTG GTGCATTATG GGCGTTTGGA 2X00 

CAAGCGAATC AGCTTAAATC TATTAGTTTA ATCGGTGTAT CAAATACTAT GCCAGTTTCT 2160 
ACAGGTATGC AATTAGTTGG TACAACATTA TTCAGCGTTA TCTTTTTAGG TGAATGGTCT ' 2220 

TCAATGACTC AAATTATCTT TGGTTTAATC GCCATGATAT TATTAGTTAC TGGTGTAGCA 2280 

CTTACTTCAC TTAAAGCTAA AAATGAACGT CAATCAGATA ATCCTGAATT TAAAAAAGCA 2340 

ATGGGTATTT TAATTGTATC TACAGTTGGA TATGTAGGTT TCGTTGTACT TGGTGACATC 2400 

TTTGGTGTTG GTGGAACTGA TGCATTGTTC TTCCAATCTG TCGGTATCGC AATTGGTGGC 2460 

TTTATCCTAT CCATGAATCA TAAAACATCA CTTAAATCAA CAGCACTTAA TCTATTGcCA 2520 

GGTGTGATTT GGGGAATTGG TAACTTGTTC ATGTTCTATT CTCAACCAAA AGTTGGTGTA 2580

GCTACAAGTT TCTCATTATC ACAGTTACTT GTTATCGTTT CAACCTTAGG CGGTATTTTC 2640

ATTTTAGGAG AAAGAAAAGA TCGTCGTCAG ATGACGGGTA TTTGGGCAGG TATTATTATT 2700

ATCGTGATAG CTGCTATAAT TCTAGGTAAT TTGAAATAGA AAGTTAAATA CTCATGTAAC 2760 

GTAAAAATGT AATCACTTCT GAAAATAACC ATTCACTTAT A3AATGATTA AAATTAATTT 2820 

TCGGGAATTT TACGTTGAAT GTTCCTCTAT ATGTCCTAGG AAATACGTGG CTCTAAAAAC 2880

AAAACGCAAT AACACATCAT GACATTAATC ATGCGTTTTA AGACTTTAAA ATTAGCGATA 2940

CTTTTAAAAT CTTCATCATA TTCATATATC AAGTATGCGC CATACATATG AAGTGGATAG 3000

CTGCATAACG CACTGCATTA TCAACTTGAA XGTATGAGTT GAACAACTAT GTCATAAATA 3060

AAAGCCCCCT TTTCACAATA TACATTTACA TATTGTGGTA AAGGGGGCTC TCATTTTCTA 3120 

CGAATACTAA AATGGATTTT ATTTTCAAAT GTGTAAACTA GACAAACACT GCCTGATACA 3180

CGTACAAAAT AATGATACTA ATAATGATTG TCAAATTGGT CGTCATACCT ATAAATGGCA 3240

GTGTTCGATA TTTAAACTGA ATACCATAAG AAATAATTGC AACACcTACC GGGAACATCC 3300

AAGTGACCAA CAATGTCGTC TTAATCATAT CATCTGATAC TGGTAACAAG ACATATACTA 3360

ACAATCCCGC AACTAATGCT AATCCATAAT GCAAACATAA ATATTTAATA GTAGCAGGTA 3420 

TATACTTTCT TTCCAGAGTA AAATTCAACA TGACACCTAG CAAAATCATT GATAACGGCA 3480 

TATTTGCATG GGAAAGTATG CTAAAGAAAT CGATTGCCAC ATGTGGTAAA TGGATGTGAC 3540

TTATATTCAA TATAAACATT ACAATGTATG TAACGAGTGG CACTGATTGT AATAATTTCT 3600 

TACCTAAATA TTTAAAATCG AATTGATCAC TACCTTCACT AAAGTAGCTA CCTACAAAGT 3660 

AAGTAATTCC AAACATCACA AAGGCACCAC CTATATCAGC CATAACAAAA TAAATAAGTC 3720

CCGTTTTAGG CCATATCACT TCAATTAGTG GATATGCAAA CAATCCAATA TTCATAGCAC 3780 
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CAATCATTTT CGCCACAATA CCATATATAA TCATTAAAAT TGGTAAAATG GAGAATGACA 3900 

ATTTTAATTC TGCACTGTTT AAATTCACAA TAACTAAAGA TGGGAGTGTG ACATTAAGAA 3960 

CTAATGTAGC AATGACTTGA CTATCTGTTG CTTTTATAAA ATTAATGCGC TTCAAAAAGT 4020 

AACCAAGCGC AATTAATAAA ATAATCATAG TAAATTGTTC TGTCACTGTT ATCCCTTCTT 4080 

TCAATAATCT TCATAATTTA TAACTTTAAC ATACTCCACA GATATTTTAG AAGTCTACTG 4140 

TTTCATGCTA TAATCTACAT TAAATGCACT TAATTATATT TCAAAGGAGT GTTATAGTAT 4200 

GTCTTTAGAA AACCAACTAG CCGAACTTAA ATATGATTAT GTTCGTCTTC AAGGTGACAT 4260 

AGAAAAACGG GAATCTTTGA ATTTAGATAC TTCCGCACTT GTTCGTCAAC TTAAAGATAT 4320 

TGAAAATGAA ATTAGAAACG TTCGTGCTCA AATGCAAGAT TAATAATCTA TCATTCAAGC 4380

AATAAATGCT TTTTGTTACA TAAATTTGAC 7AGCATTGCT CTGAATACGT TATATTGA7G 4440

AATTGCTTCA TTTTTCGCTC AATTACATCT AGAATCACAA GATGTTGTCG TGTTATGATT 4500

TAGTGT77CA TTAACAACAT ACACGCATAT CTATCCCAAC ACTGCTATTT ATGTTTTCTA 4560

CGCTGnTGTA CTACATGAAC CCTTTGAAAC GGAGAGGAAG TTATCATATG CAATTTTAnC 4620

TGATTTTACT AGCAATACTT TAACnAATTG nTAGTTTAAT AGAATTTTA 4669

(2) INFORMATION FOR SEQ ID NO: 133:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2785 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133: 



TTTGCACCCA TCTGaTACAA TGCACCATGC GGTTTAACAT GATTAATTTi' AACTTGATGA 60
ATGCGACAAA ACCCTTGTAA TGCACCTAAT TGATAAATCA TCAAATTATA AATCTCGTCG 120
TTAGAGATAT CTATATTTCG TCTGCCAAAG CCTTTCAAAT CAGGTAAACC AGGATGTGCA 180
CCTACTGCAA CATTATGTGC TTTGGCAAGT TTTACCGTTT CATTCATTAC ATTTTCATCA 240
CCAGCGTGAA AACCACAAGC AACATTCGCA CTTGTAATTA ACGGAATAAT TTGATGATCA 300
CCACCAAAGG AATAATTTCC AAATGCTTCG CCTAAATCAC AATTCAAATC AACTCGCATT 360
ATAATTCCAC CCCTTTAACA ATTTGATGTT TTTCTAAAAA TTTAATATCA ACATCTTTTG 420
CATCTCCATC ACGATATAGT GGATAATTTA AAACTGCATA TAAAAAATCG GCAGTTGTAG 480
AAAATCCATC TATCACCATT TCATCTAAGG TGACTTTCAA CTTATCAATT GCTGAAGCTC 540
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AACCGTGATA TAGTAAAGAA 


TCGACTCGCA CATTAAAGCC TTGAGGTAAA TGTAACGCTG 660
TCACTTTACC TGGTGTTGGT TGAAATTTCT TTTCaGGATT TTCGGCATTT ATTCTCGCTT 720
CTATCACATG ACCATTAAAT TGAATATCGC TTTGTGAAAA AGGTAAATGA TTATGTTCCA 780
ATAAATACAG TTGTGCTGCA ACCAAATCAC GTTCTGCTCG CATCTCTGTA ACAGTATGTT 840
CAACTTGTAT TCGAGCATTC ATTTCAATAA AGTAATGTGC GGTATCAGTT ACTAAAAATT 900
CAATCGTACC TGCACTTCTA TAATTTGCTG CACGTGCAAC TTTAACAGCA TCGTTACATA 960
TTTGTTGTCG TCTTTCTTCA GTTAATGCTG CACAAGGAGA TTCTTCGATT AATTTTTGAT 1020
TTTTACGTTG TACAGAACAA TCACGTTCCC CTAAATGTAC ATAATTATCC TGCCCATCTC 1080
CCaTAACTTG AACTTCAACA TGTTTTGcAA CAGGTATAAA AGCCTCAACA TAAACACGAT 1140
CATCATCAAA GTATTTTTTT CCTTCACTTT TAGCTTCTTT AAATGCCTTT TCTAAATCTT 1200
CAGCTTTCTT TACAATACGT ATACCTTTAC CACCACCGCC ACTGGCAGCT TTGATAACAA 1260
CTGGATAACC GATGTCTTTG GCAAGATTCT CAATTTCAGA CACATGATTC ACAGCACCAT 1320
TTGATCCTGG AATCACAGGA ACACCTGCAT GATGAACTGT TTGTCTTGCT GTTATTTTAT 1380
CCCCCATCAT TTCCATCGTT TTTTTAGTAG GCCCTATAAA CGCTATGCCT TGTTCCTCAA 1440
CGGTTTGAGC AAATTTTGTT GATTCTGATA AAAAGCCATA TCCTGGGTGA ATTGCATTAG 1500
CACCAGTGAT TTGTGCAGCA GATATGATGC GGTCAATATT TAAATAACTA TCTAAAgCAT 1560
TArcwTCCCC AATACATATA GCTTGATCTG CTAAATGTAC ATGCAAGCTT TGCTCGTCCC 1620
CTTTTGCATA AACTGCTACA GTTTCAATCC CATATTCTCT GCAAGCTCTT ATAATCCTTA 1680
CAGCAATTTC ACCTCTGTTC GCAATTAAAC AACGAAGCAT TTACTTACCC CCTTTACTTA 1740
ATACGTACCA AAACTTGGTC GTATTCAACA TTTGTGCCAT GATCAGCTAC TATTTCAGTA 1800
ATTtCTCCAG caacatctgt TGTTACCTCG TTTAATACTT TCATCGCTTC AACATATCCT 1860
ataatatctc ccttgttaac TTTGTCACCG ACATTCACAA TTGGTTCAGT TAATTCTTTA 1920
CTATCTTGTA AAAAGAATGT ACCTATCATT GGTGATTTAA TGTCATGATA ATCATTTGTC 1980
GAAACATCGG AGTTATCATT CGCTTTTGAA GCTGTCAAAT CATTATTGTT CATACTTTGA 2040
TTTGATTGAT TACTG7GTGC AGCCAAATGA TTCGAGTCAG TGAAGTCAAT TTCTATTTCA 2100
TCTTCAAAAT TTTTATATTT AAATTTCTTA ACATCATTTT CCTTCACTAA TTTGATTATT 2160
TGTTCGATTT nTTCAATATT CATTTTACAA ATCCCCTTTT AAAATTGTTG CTAATTTTTT 2220
CGAAGTATGT CGCAAGCTAG ATGTATCAAA AATTGGAGTC TTTTGATGAC TCTTAAGAAT 2280
TTCATTAAAC AGAGACATTT GTTCCCGATT CTTATCTACA GCTTCTTGGA ATGATATCCA 2340
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TACAGTTGCA ATTTTGGTAT AACCACCTAT CGTTTGTTTA TCATTAAGCA GAATAATAGG 2460

TTGACCATCA TTTGGTACCT GAACACTACC AAGAGCAACC GGTTCAGAAA TGATATCTGC 2520 

TTGATTAAtT GGTGCAACGC TGTCACCTTC CAAACGATAG CCCATACGGT CTGATTGTTC 2580 

AGTAATTAAA TATGGATGAT TTACAATTTT CGCTCTAGCC TCTTCAGAAA ATGCCTCGAA 2640 

TTGAGGTCCT TGAAGAATGT GTATAATATT ATTTTCTGGC AATAAATCGT CCTGTAAATG 2700 

AATCGTCTTT CCAATGTTTT CTTTAAAGTC ATTATTTATT TTCACTGTTA TTACATCATC 2760 

AGCTAATAAC TTTCTACCTT TGAAT 2785 
(2) INFORMATION FOR SEQ ID NO: 134: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1010 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 134: 



AATGGAAACG GTTGAAACA3 CAATTATTAC TATTTCTATG GGTGAAGGTA TTTCAGAGAT 60
ATTTAAATCA ATGGGTGCCA CACATATCAT TAGTGGTGGA CAAACGATGA ATCCTTCTAC 120
AGAAGATATC G7TAAAGTCA TTGAACAATC 180
TAATAAAAAT ATCTTAATGG CAAGTGAACA AGCAGCGAGT ATTGTTGATG CAGAAGCTGT 240
TGTTATTCCA ACGAAATCTA TTCCTCAAGG TATAAGCGCA CTATTCCAAT ATGATGTGGA 300
CGCAACACTT GAAGaAAATA AAGCGCAAAT GGCTGATTCA GTAAATAACG TTAAATCTGG 360
TTCATTAACG TACGCTGTTC GTGATACGAA AATTGATGGC GTTGAGATTA AAAAAGACGC 420
GTTTATGGGC TTGATTGAAG ATAAGATTGT AAGCAGCCAA AGTGATCAAT TAACAACGGT 480
TACTGAGTTG TTAAATGAGA TGTTAGCAGA AGATAGTGAA ATATTGACTG TGATTATTGG 540
TCAAGATGCA GAGCAAGCAG TTACAGATAA CATGATAAAC TGGATCGAAG AGCAATATCC 600
AGATGTAGAA GTGGAAGTTC ATGAAGGTGG ACAACCAATT TATCAATATT TCTTTTCAGT 660
AGAATAAAAA TTTAAAATAA AAAACTACCA ATGATAAATC ATCAGTTGGT AGTTTTTTAT 720
TTTGCTATTT TAGTGATATT GCGGGTTAAA AGTATCGTTC TCGAGTTGCT AACAATGTCA 780
TGTTCAACTT AGTCATGATA AAATAAATAA CATACTAAAT GATACGTAAA ATCAAATAAA 840
ACATAGGTGA TTTATTTTGG CTAAAGTAAA CTTAATAGAA AGTCCATATT CTCTTTTACA 900
ATTAAAAGGT ATAGGTCCTA AGAAAATAGA AGTATTGCAA CAACTAAATA TTCATACAGT 960
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(2) INFORMATION FOR SEQ ID NO: 135: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1540 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135:








TGTAGTTGAA CATGAACAAC AAAAGAAAGA AAAGACAAAA AAGCAATACA AGCCATTTTG 60
GAT7GTCATG AGTTTTATAA TACTTATAGT TGTACTATTA CTCCCGGCAC CTTCAAGTCT 120
GCCGATAATG GCTAAGGCAG TACTAGCTAT TTwAGCTTTT GCAGTTATTA TGTGGGTAAC 180
GGAAGCTGTA TCATATCCGG TGTCAGCAAC TiTAATTATT GGCTTAATGA TATTACTTTT 240
AGGATTTAGC CCTGTTCAAA ATTTAGGGGA GAAGCTAGGT AATCCGAAAA GTGGCAGTGC 300
TATTTTAGCT GGAAGTGACC TTCTAGGAAC TAATCATGCA TTATCATTAG CGTTTAGTGG 360
ATTTGCAACT TCAGCTGTAG CTCTCGTTGC AGCTGCATTA TTTTTGGCTG CTGCTATGCA 420
AGAAACGAAT TTGCATAAAA GACTAGCTCT TTTAGTGTTA TCAATTGTTG GTAATAAAAC 480
TAGAAATATA GTTATTGGAG CAATTATCGT TTCAATTGTA CTTGCATTTT TCGTTCCTTC 540
TGCAACAGCT AGAGCAGGGG CAGTTGTACC AATCTTGCTG GGTATGATTG CGGCATTTAA 600
AGTTTCCAAA GATAGCAAGT TAGCGTCTTT ATTAATAATT ACTTCAGTAC AAGCTGTGTC 660
AATTTGGAAT ATTGGTATCA AAACGGCGGC AGCACAAAAT ATCGTAGCGA TTAATTTTAT 720
AAACCATCAA TTAGGATTTG ATGTTTCATG GGGCGAGTGG TTCTTATATG CAGCGCCTTG 780
GTCCATAGTT ATGTCCGTAG CTTTATATTT CATCATGATT AAAGTGATGC CTCCAGAAAT 840
TAATSCAATA GAAGGTGGTA AAGATTTAAT AAAAGAAGAA TTGCATAAAC TTGGCCCCGT 900
TAGCCCACGT GAATGGCGTT TAATTGTTAT ATCGATGTTA TTATTACTGT TTTGGTCAAC 960
TGAAAAAGTA TTACATCCGA TTGACTCTGC ATCCATTACT ATTATTGCTT TAGGTGTTAT 1020
GTTAATGCCG AAAATTGGTG TCATGACATG GAAACATGTT GAAAATAAAA TACCATGGGG 1080
AACAATTATC GTGTTTGGTG TAGGTATTTC ACTAGGTAAC GTTCTTTTGA AAACAGGTGC 1140
AGCTCAATGG TTAAGTGATC AAACTTTTGG TGTTTTAGGT TTAAAACATT TACCTATTAT 1200
CGCGACAATT GCACTTATCA CGCTTTTTAA TATATTGATT CATTTGGGCT TTGCGAGTGC 1260
AACAAGTTTA TCATCAGCGT TAATACCTGT ii"ilATTTCG CTAACCTCTA CGTTACACTT 1320
AGGAGACCAG TCTATAGGAT TTGTTTTAAT TCAACAATTT GTTATTAGTT TTGGTTTCTT 1380
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AGATTTCTTG AAGGCAGGTA TACCATTGAC AATTGTAGGG aATAtCCAgT GaTAGTTTTT 1500 
AGCATGACTT ATTGGAAATG GGTAAGGTTG CnTTAATTAA 1540 
(2) INFORMATION FOR SEQ ID NO: 136:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11823 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 136:

ACTTCTCACA ATAAGAAATA TGAAATTGTT ATGTGTTAGT TGAGATTCAG TGATGAATTA 60 

CTTTTATCAT TTAAAATGTT GTTATCATTG TCATGCGTTA CCAAATCGCT TACGTATACA 120 

CGATTCCCAA TCTTAACATA GACGATTTGT ATATCAGAAT TTTCTGATTA CTA^CAGTTT 180 

ACCTAAGTTT AAATATCTGT TCAATGATTT TCAGTTATTT TTAAAAGAAA AATCGTAATG 240 

CTGCCATGAT AACAATCCCA CTAATAATTG TAATAGTTAA AtACGCGTGA TTATAGATAA 300

AATAAGCGTC GGAATGAGCG CGATAaTGTA AGGGATGTTT AATGTATACC CCTCACCATG 360

AGGCGTCTGT TGAATAATGC TGTCAATGAC AAGTGCCGTA AATAGTGTGA TTGGGATAAA 420

TGATAGCCAT CGAACCACGA CATCAGGCAA TTGCACTTTT GAAATCATGA TA^AAGGTAT 480

AATTCGAATT AATAGCGTTA CGATACCACA CAATAAAATA AGTATTAACA TGTTCATATG 540

AGTTATCATT GTTCCATCAT CACTCCTAAC GCTGCTGAAA TTGTGGCTGC AATTAATATT 600

GCTAGATATG AAGGCATAAA CATACTTAGC GATAACATCA TTACTATGAC GGCAATAATG 660

AGTACTATGT AAATTCTTAA TCGCGATTTA GTAATTGATT CAAATTGCGC AATGGCCAAA 720

AAGATAAACA TAGCCGTGAT AGCAAAATCT AACCCTAGCG TTTGCGGATT TGAGATATAT 780

7CGCCAAATA AAGCCCCAGC TACACATGAA ATTGCCCAAA ATAAATATGC TGTGATGTTA 840

AGACCATCCA TCCAACGATC ATTGATAGCT TCTCCTTTTA AATAAGGTGT AATGGCGACG 900

CCAAACGTTT CGTCAGTTAC TAATGAACCT AATCCAACAC GGTTCCAAAA CCCATATGTC 960 

TTGAAGTTTG GTGCAAGCGA CATACTTAAA AGGAACATTC TTGAATTTAC GATAAATACA 1020 

GTTAGTACAA TCGCTGATAT AGGTGTACCT GCTATAAACA ACGCGCACAT AATAAATTGC 1080 

GCAgcaCCGG CATATATAAC AAGACATAAC AAGACAATTT CTAAAATACT AAAGTTTTGA 1140 

GACGAAGCCA CAATACCAAA TGAAATACCA ACACCGGCAT AACCCAATAA TGTTGGGATA 1200

CACTCTTGCA CGCCTTGTCT AAAACTTAAA TGTGTTGTCA TCTCAATTAC CTCCTTTGCC 1260 
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TAAGCAATAA CATTAGACAT CAGTTTGTCT GAGGTTAGAC ATTCCGGAGT CTTTAGTCAG 1380
CTTCATATTA ACTTTTTATT TTTGAGAATT TTCAATTTTT TATTTAAGAC TACCTCCATA 1440
TTTTCTATGG aTTTGTAGTT GTTTTTAAGT ATCAATTTTA TAAATTTTTA TATCTGATGA 1500
TGAGTCTGGG aTATTGaTTC ATGTACCACT CCCTTaTaAT CATCCCCTCC CCCTaCCCTA 1560
CTCCATCGAT ATAACTCATA CTACATATCA ACGAAATCAG TATTTTATCG CTTCCTTTCC 1620
TATATTAGTG ATGCTCAAAC TTGTTACGTT TTAGATTGTT TTAGTTCATC ATAATTATCC 1680
CGTATTGTTG CTATAATGAA ATGCGTTCAC CCCATTAAAC CACAAACTTA ATTTATTGTT 1740
GTTATGTGCA TTGGCTCACT ATTATATTTT TACAGCACAA AAAAAGTGGC GACAGTTCGT 1800
CACCACTTTT TAAAATATTA TTTAAAGTAT CTTGCCCTTG CTTTAAGTAT ACGTAGATAT 1860
ATACTTTTTA AAGCTTGTAG CTAAAGCCTT TATTTAACTG GTTTTGAAAT TTGTGTTTTA 1920
CCACCCATAA ATGGTACTAA TGCTTCTGGA ATTGTTACTG TTCCATCTTC ATTTTGGTAA 1980
ITTTCAACAA TAGCAGCAAA TGTACGTCCA ACTGCTAAAC CACTACCATT TAATGTATGT 2040
3CTAATTCTG GTTTAGCTGC TTTGTCACGC TTGAAGCGGA TGTTAGCACG ACGCGCTTGG 2100
fcAATCCGTAC AGTTTGAGCA TGAACTAATT TCTTTATAAT CATTGTAGCT TGGTAACCAA 2160
&CTTCTAAAT CATATGTTTT GCTTGCACTA AATCCAATAT CACCTGTACA TAAAATAACA 2220
CGACGGTATG GTAAACCTAA CTCTTCTAGA ATTGCTTCTG CGTTTGTTGT CATTTCTTCT 2280
\AAGCATTCC ATGAATCTTC AGGTTGTTCA AAACGTACCA TTTCCACTTT ATCGAATTGA 2340
TGTAAACGAA TTAATCCTCT TGTATCTCTA CCTGCTGATC CTGCTTCACT ACGGAAACAT 2400
3CAGATTGAC CAGTGAATTT TTCAGGAAGT ACACCTGGTT GAATAATTTC ATTACGGTAG 2460
VAATTCGTTA ATGGTACTTC AGCAGTTGGA ATTGTATATA ATCCTTCTTT TTCTACTTTA 2520
\ATAAATCTT CTTCAAATTT AGGTAATTGA CCTGTACCAT ACATTGTATC TGCGTTCACA 2580
\GCTGTGGTA CCATCATTTC TCTATAACCA TGTTGTGTTG TATGTTTTGT AATCATATAG 2640
CTCATTAAAG CACGCTCTAA TTGCGCACCT TCATTTGTTA AATATACAAA ACGCGCACCT 2700
3AAACTTTTG CTGCACGATC AAAATCAGCC ATTTTCAATT CTTCTACAAT ATCCCAATGT 2760
5CTTTGGGTT CAAATGAAAA CTCaCGTGGT GTACCCCACT TTTTAACTTC AACGTTATCT 2820
CCATCAGATT CACCTTGAGG TACATCATCA CTTATTAAAT TTGGAATACG ACAAAGGATA 2880
;CTGTCATTT TATTATCAAT TTCATTTAAT TGACTATCTT TTTCTTTAAT ATCGTCACCT 2940
IATGTGCGCA TTTCAGCAAT CACATCATCA GCATTTTCTT TATTACGTTT TTTTAATGCG 3000
VTTTCTTCGC TTACTTTATT ACGACGTGCT TTCATTTCTT CTGTTGCACT AATTAATTTA 3060



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TCAATTTTGC TCTTAACTGT GTCAGGCTCA TTTCTGAATA ATCTAATGTC TAACATTAAC 3180

CTTCATCCTT TCCCAAATAA TTATCATTTA TTA7GGAATG ACGTACGTCT TTATTTTTTA 3240 

GAAAATAAAA AAAGACCACA TCCCTACAAG GGACGTGGTC TACGCGTTGC CACCCTATTT 3300 

AACAATTTAA GTTATAAAGA TACACTAAAC CTAAATTGCA CTTCACTAAA ATAACGGTTA 3360 

TCACCGATTG TTCTTTTAAA TTAAGTAGGT AGATTCATAT ATATGTTGAT TCTTGTTCAC 3420 

ACTAACCACA AGCTCTCTGA TATCGAACAC TATATATTAC TTGTCCTACG AACAATGTCT 3480 

TATTAAGTTA TTTTTAATAT AGCAAACTAT ATTTGCTTTT TCAAGTAACG ATTTCAAACA 3540

TCACTCATGT CGATTTAGTG AGATGCAGTC GTTTGATAAA TTGATTGCTT TAAATACTGT 3600 

GCAACCCCTT CAATATCTTT ATGAAATTGA CGATCATGTG TAATGGATGG CACGATACTT 3660 

CGAAACTCAT CATACTTGCG ACGTGTTTTT GGTGATAATC CTTCAACACC TTTTAACTCT 3720 

GCTGCTTGTA ATGCAATAAC ACATTCGATT GCCAGCACAC GTCTTGCATT TTCAATAATT 3780 

TGATAACCAT GTCTAGCAGC TGTAGTTCCC ATAGATACGT GATCTTCTTG GTTCGCAGAT 3840

GAAGTGATAG AA7CAACAC7 CGCTGGATGC GCTAAAGTTT TATTTTCAGA AACGAGACTT 3900

GCAGCAGCAT ATTGCATAAT CATCGCGCCA CTTTGCAATC CTGGCTCTGG ACTAAGAAAT 3960

GCTGGTAAAT CACCATTTAA TTGAGGATTT ACTAGTCGCT CTAGACGACG TTCCGATACG 4020

TTTGCTAATT CACTTACACC TAATTTAAGA TGATCTAATG CAAAAGCAAT AGGTTGTCCA 4080

TGGAAGTTAC CACCTGAAAT AACAAACGTT TCATTTGCTT CCTCAAATAT AAGTGGATTA 4140

TCATTAGCCG CATTCATTTC AAATTCTAAT TGCTGTTTAA CATAATTGAA TACTTGAAAA 4200

CTCGCGCCA7 GGATTTGTGG TATACAACGC AACGTATATG CATCTTGTAC ACG7ATTTCT 4260

GATTGTCGCG TCGTTAATGT TGATCCTTCT AACCAATCAC GCATACGCGC TGCCACATTA 4320

ATCTOTTCTT GAAAA7TACG AACTGCGTGC ACATCATGTC GATATGCATC TATAATGCCA 4380

77AAGAGAC7 GATGCGTTAA TGCAGCAATC CAT7CAGATT GG7AACCTAA A7CT7C7GCT 4440
7C7A7ATAAC 7AATGACACC 77GAGC7G7C A7AGC77GCG TACCATTAAT CAATGC7AAA 4500

CCTTCTTTAG CC7GAAGG77 CAAAGG7TGT CTATT7AATT CTCT7AATAC A7CGTCACTA 4560 

TCCTTTTCTT CCCCTCTGTA CAA7ACTTTC CCTTCACCAA TTAATGCTAA TGCTAAATGT 4620 

GATAA7GGCG CTAAATCTCC TGA7GCACCG AGAGAGCCTT GC7GTGGGAT 7ATCGGTATA 4680 

A7ACGT7CAT TTATAAAAAA TTGTAATTGT CTCACTAATT CTAAAGTGGC ACCTGAATGA 4740

CC7TTTAA7A ATGTATTCAA TCGTAAAATC ATCATGACTA ATGCTACTTC TTTTGAAAAT 4800 

GGCTCACC7A GTCCACAGGC A7G7GAGCG7 A7CAGA7TCA CTTGTAATTC ATTATATTGC 4860
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TCCTCATTTT CAATAATACG TTCAACTACC GCTCTACTTT TTTTGACACG TTCTAACGCA 4980
TCATCAATAA TTTCAATCTT TGATTGTTGT TGTAAAAATG ATTTAATATC CTCAATTGTT 5040
AGTGTTTCAC CATCTAAATA TAAAGTCATA TATGTTACCC CCTTGTTTAT ATTAAGTAAC 5100
CCATCCTTCT TGAAGTATAC GTTTTCATTT TTATTGAAAC AATGGTTTTA CGTACATTTA 5160
TAACCTATTA TCAGAGCACT ATTGTAGTGC GTTAAAGGAT ATTAAGATTG TTGTAAGCAT 5220
ATTTAATAAT TTATCTATTG ACGAATTGCA TATACAGGTA TAGTATTTTC TATTGTATTT 5280
AACGACAAAT AATAATGAAT TCAGAAATTT ATAATACATT TTGTTAAAAG TTACTATATA 5340
TTTTTAAAAT TGAATAAATT CGGAAAAGGC TTTTACATGG GAGGTTATAT CACTATGGAA 5400
ACGTTAAATT CTATTAACAT TCCTAAGCGT AAAGAAGATT CACATAAAGG TGATTATGGC 5460
AAAATTTTAT TAATTGGTGG ATCTGCTAAC TTAGGTGGTG CCATTATGTT AGCGGCTCGT 5520
GCATGTGTAT TTAGCGGTAG TGG7TTAATC ACTGTAGCTA CACATCCAAC AAATGYTTCA 5580
GCATTACATT CTCGTTGCCC AGAAGCGATG GTTATTGATA TTAATGATAC GAAAATGTTG 5640
ACGAAAATGA TTGAAATGAC TGACAGTATA CTAATTGGTC CAGGTCTTGG CGTTGATTTC 5700
AAAGGAAATA ATGCCATTAC ATTCCTACTA CAAAATATAC AACCGCATCA AAATTTAATC 5760
GTAGACGGCG ATGCGATTAC AATCTTTAGT AAACTGAAAC CGCAATTACC TACATGTCGT 5820
GTGATCTTTA CACCACACCT CAAAGAATGG GAACGATTAA GTGGTATTCC TATTGAGGAA 5880
CAGACATATG AGCGTAATCG TGAAGCAGTT GATCGTTTAG GTGCAACTGT TGTACTTAAA 5940
AAACATGGTA CTGAAATTTT CTTTAAAGAT GAAGACTTTA AATTGACAAT CGGTAGCCCA 6000
GCAATGGCGA CTGGTGGTAT GGGCGATACA CTTGCTGGTA TGATTACAAG CTTTGTCGGT 6060
CAATTTGATA ACTTAAAAGA AGCGGTTATG AGTGCCACAT ATACACATAG TTTTATTGGC 6120
GAAA£CCTTG CAAAAGATAT GTATGTGGTG CCACCATCAA GACTTATCAA TGAAATACCT 6180
TACGCAATGA AACAATTAGA AAGTTAGTCA TTACTAATCA TTGAATATAG TAAAGCATTA 6240
CTTTC7AGCA TAAAAATAAG ACTCCCCTAC ATATAGGGAA GTCTTATTTT TTATTATTCT 6300
TCATCTGATG ATTGTTGTAT ATCTTCTTCA ACACGATCCA TGAAATCTTG TCTTACTTCA 6360
ATACGTCCAT CTTCATCATT TTCTTCTGAA TCAATCACTT CAGTATGAAT TGCATTTCCT 6420
GGTGTTTCAT CATTTaCAAC CGCTTCACGT TGTTGTTCAG TACCATCTTC AGATACAGTT 6480
GAAGTAGATT GCTCATCTTC ATTCGTTTCA TCTTCTGCAT CTTCTTTTAC TTTAGCAACC 6540
GTTGAAACAA ATTGATCATC ACCTAAGCGA ATTAAGCGAA CACCTTGTGC TGCACGACCA 6600
TTTTGAGAAA TATCTGCAAC ATCTAGTCGA ATAATGACAC CTGCATTAGT AACAATCATT 6660
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GTAGCTGTTT TAATACCTTT ACCACCACGA TTTGATAAGC GATAGTCATT AACTGGCGTA 6780
CGTTTACCAT AACCATTTTC AGTAACTACT AATACTTCAT CAACACTGTT TGCATGAGCT 6840
ACATCAAGCC CTACAACTTC GTCACCTTCA CGAAGTGTAA TACCTTTCAC ACCCGTTGCT 6900
GTACGGCCTA AAGGACGTAA TGTTGATTCA GGGAATCGAA TTAATGATGC ATGTGATGTA 6960
CCAATCAAGA TATCTTCTTG ACCACTTGTT AAGCGAACTG CAATTAACTC ATCATCTTCT 7020
CTGAACGAAA TCGCAATCTT ACCATTTCTA TTTATTCTTG AGAAGTTACT TAATGCTGAA 7080
CGTTTAACGA CACCACGTTT AGTTGCAAAC ACTAAGAAGT TGTCTTCACT TTCAAGGTCT 7140
TTAACAGCAA TCATTGTACT AATGACTTCA TCATTTTCAA GTTCAATAGC ATTCACTACA 7200
GGAA7ACCTT TAGACTGTCT TGATAACTCA GGCACTTCGT AACCTTTAAG TTTGTATACA 7260
CGACCTTTGT TAGTAAAGAA CAATACATGG TCATGTGTAC TTAAAGTTAC CAATTGACTG 7320
ACAAAATCTT CTTCCAATGT ATTCATACCT TGAACACCAC GACCACCACG GTTTTCACCA 7380
CGATATGTAG ATACCGGCAA ACGTTTAATG TAGTTATTAT GGCTTAGTGT AATTACTATT 7440
TGTTCTTCTG GAATTAAGTC TTCGTCCTCT AAGTCTTCAA ATCCACCTAA TTGAATTTCT 7500
GTACQACGAT CATCACCGAA ACGATCTCTA ATTTCAGTCA ATTCATCTCT AACTAACTGT 7560
AATAACACTT CTTCATCAGC TAAGATTGCT TCTAATTCAC TAATATAATT TAATAACTCA 7620
TTATATTCAG CTTCAATTTT GTCTCTCTCT AAACCTGTTA GACGTCTTAA ACGCATGTCT 7680
AAAATAGCTT GAGCTTGTTT TTCAGAAAGT TTGAAGCGTT GTTGCAAGCT TTCCATTGCA 7740
ACTTTATCTG TATCTGACTC ACGAATCGTT GAAATAATTT CATCGATATG GTCAAGTGCG 7800
ATACGTAATC CTTCTAAAAT GTGGGCACGA TCTTTAGCTT TACGTAAgTT GTATTGCGTA 7860
CGTCTTCTAA CAACTGTCTT TTGATGCTCT AAATAATGTA CCAACGCTTC TTTTAAATTA 7920
ATAAGCTTCG GTCTACCATT TACAAGTGCA ATCATATTCA CACCAAATGA TGTTTGAAGA 7980
GGTGTTTGTT TGTATAAGTT ATTTAAAATG ACACTAGCA7 TTGCATCCTT ACGCACATCA 8040
ATAACGACAC GCACACCAGT ACGTAAACTT GTTTCATCAC GTAAATCAGT GATACCGTCA 8100
ATTTTCTTGT CACGAACGAG CTCTGCAATT TTTTCAATCA TACGAGCCTT ATTCACTTGG 8160
AAAGGAA7TT CAGTGACAAC AATACGTTGA CGTCCGCCTC CACGTTCTTC AATAACTGCA 8220
CGAGAACGCA TTTGAATTGA ACCACGACCT GTTTCATATG CACGTCTAAT ACCACTCTTA 8280
CCTAAAATAA GTCCAGCAGT TGGGAAATCA GGACCTTCAA TATCCTCCAT TAACTCAGCA 8340
ATTGAAATAT CAGGGTTCTT ACTTAAGCTA AGTACACCAT TGATTAATTC TGTTAAGTTA 8400
TGTGGTGGAA TATTCGTTGC CATACCTACC GCGATACCTG ATGCACCATT GGCTAATAAG 8460
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AAATCTATTG TATCTTTATT AATATCACGT AACAGTTCAA GTGTGATTTT AGTCATACGC 8580 

GCTTCAGTAT AACGCATTGC TGCTGCGCCA TCTCCATCCA TTGAACCAAA GTTACCTTGG 8640

CCATCAACAA GCGGATAACG ATAACTGAAA TCTTGAGCCA TACGTACCAT TGCTTCATAA 8700 

ATAGATGAGT CACCATGAGG GTGATATTTA CCCATTACGT CACCAACGAT ACGTGCTGAT 8760 

TTTTTATATG ATTTATCCGG TGTCATACCT TGTTCATTTA ATCCATATAG TATACGACGA 8820 

TGTACTGGTT TTAAACCGTC ACGAACATCT GGCAATGCAC GAGCAACGAT AACACTCATC 8880 

GCATAATCTA AAAATGATTC ACGCATTTCA CTGGTAATAT TTCGTTCATT TATTCTTGAT 8940

TGAGGTAATT CAGCCATCAA GAGTTCCTCC TTCAAAAGTT CAGTTCACAG CGCTTAGAAG 9000

TC7AAGTTTG CATAAACTGC ATTATCTTCT ATAAATTGTC TACGGTTTTC TACAACGTCA 9060

CCCATTAACA TTTCAAATGT TTGGTCCGCT TCAATCGCAT CTTCAAGTTT 7ACTTGTAAA 9120

AGAGCGCGGT GCTCAGGGTT CATTGTTGTT TCCCAtAATT GATCTGCATT CATTTCTCCA 9180
AGACCTTTGT ATCGTGCAAT AGACCATTTT GGTGTTGGAT TCAATTCAGA TTTAAGTTTA 9240

TCAAGTTCCC TATCATTGTA TACATAATAC TTTTGTTTAC CTTGTGTCAG TTTATACAAC 9300

GGTGGCTGTG CAATATACAC ATAGCCTGCT TCAATTAACG GTCTCATAAA TCGATAGAAG 9360

AATGTTAATA ACAATGTTCT AATATGCGCT CCATCCACAT CGGCATCAGT CATAATGACG 9420

ATTTTGTGAT ATCTTGCTTT CGCTAGATCA AAGTCGCCAC CGATTCCTGT ACCAAATGCT 9480

GTGATCATTT GACGAATTTC ATTGTTATTC AAAATTCTAT CTAATCGTGC TTTTTCAACA 9540

TTTAATATCT TACCTCGTAA TGGTAAAATC GCCTGCGTTC TAGAGTCACG ACCAGATTTT 9600

GTAGACCCCC CGGCAGAGTC CCCTTCGACT AAGAAAATCT CACATTCTTC AGGACTTTTA 9660

CTAGAGCAAT CGGCTAATTT ACCTGGAAGG CTTGCTACAT CTAACGCTGA TTTACGACGT 9720

GTTACTTCAC GCGCTTTTTT CGCAGCAACA CGTGCACGTG CCGCCATAAT ACCTTTTTCA 9780

ACCACTGTAC GTGCGACTTG TGGATTTTCA TATAAAAATC GTTCAAAGTG CTCTGAGAAT 9840

AATTTATCTA CAACTTGACG CACTTCAGAA TTACCTAATT TTGTCTTCGT TTGACCTTCG 9900

AATTGAGGAT CACCATGTTT GATAGATATA ATTGCTGTCA TACCTTCACG TGTATCTTCA 9960

CCAGAAAGTC TATCTTTTTC TTCTTTCATA ATCTTGCTAC TTAAACCATA ACTATTTAAG 10020 

ACACGCGTTA ATGCACGTTT GAATCCGTCT TCATGCGTAC CACCTTCATA CGTATGAATG 10080 

TTATTTGCGT AAGTTAAAAG ATTTGTGGCA TATCCTGAGT TATATTGAAT CGCAATTTCT 10140 

ACTTCAATAT CATCTTTAGA TTGATGAATA TAAATTGGCT CATCATGAAT AGGTTCTTTA 10200 

TTTTCGTTCA ATAACTCAAC GTACGATTTA ATACCGCCCT CATAGTGATA GGAGTCTTCT 10260 
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GCAAGCTCTC TAATACGCTG CTGTAATGTT TCATAGTTGT ATACAGTTGT CTCTGTGAAG 10380

ATTTCTCCAT CTGCTTTAAA ACGAAtGaCA GTACCTGTCT TAtCAGTnGT GCCAACTTCT 10440 

TTTAAGTCAA ATTGAGGTAC ACCTTTTTTA TATGCTTGAT GATATATAGT CTCATTTCTG 10500 

TGTACATATA CTTCTAAGTC TTGTGACAAT GCGTTTACAA CTGATGAACC AACACCATGT 10560 

AAACCACCAG ATACTTTGTA TCCGCCACCG CCAAATTTAC CACCAGCATG TAAAACAGTT 10620 

AAAATAACTT CGACAGCTGG ACGTCCCATT TTTTCTTGAA TATCAACTGG GATACCACGT 10680 

CCGTTATCCG TTACTTTAAT CCAGTTATCT TTTTCAATAA CAACTTCAAT TTGATTTGCA 10740 

TAACCaGCTA ATGCTTCATC GATACTATTA TCGACAATTT CCCACACTAA ATGGTGCAAA 10800

CCTCTCTCTG AA3TCGATCC TATATACATA CCTGGTCTTT TACGTACTGC TTCTAAACCT 10860

TCTAATACTT GTATTTGCCC AGCACCATAA TTATCCGTGT TGTTTACATC TGACAATGCA 10920

GTCACCATCG CTTTCTGTTA CTTTATAATT TCACCTTGAT TAATACGATA CAATTTAGCG 10980

TTATTCATGA TTTCATGATC AATACCATCT ACAGATGTCG TAGTGACAAA TGTTTGTACT 11040

TTATGCTGAA TCGTACTTAA TAAATGCGTT TGACGCGAAT CATCTAATTC ACTGAGTACA 11100 

TCGTCTAATA ATAAGATGGG ATATTCCCCA ACTTCGATAT TCATTAACTC AATTTCAGCT 11160 

AATTTAATGG ACAAAGCCGT TGTACGTTGC TGTCCTTGAG AACCATATGT TTGAGCATCC 11220 

ATGCCATTCA CATCAAAACT TATATCATCT CGATGTGGTC CGAATAAGCT AATGCCTCGT 11280 

TCTTTTTCTC TTTGCATATT ATCGCTAAGA ATAGACATAA TTTCTTCAAG TCGTGCCGCT 11340

TCATTTTGAG CATAATCAAA TTTAAGACTA GGTAAATAAT TCAGCGACAA CGCTTCTTTA 11400

TCATTTGTGA TACCAGCATG AATCGGTTTA GCTAACGACT CTAGCTCTTG AATAAAATGT 11460 

GCACGTTTAT CAGTTACTTT CATTGCATAT TCAGCAAACT GCTGATTTAA TACTTCCAAC 11520 

ATTGTTAAGT cc :g gcctaattgt aactgcttta agtaattatt CTTTTGCTTT 11580
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aaaatacgtt ggtattgagc taaatcattt aagtaaacag cacaaatttg GCCCAACTCC 11640

ATATCTATAA AGCGTCGTCT TATTtGrGGr GAGCCTTTTA CAATATTCAA ATCTTCTGGC 11700
GCAAATAGAA CCACATTGAG GTGTCCAATA TATTGAGTTA GACGACTTTG CTCTAAGTGn 11760
ATTCACTTTG GACTTGTTTA CCTTTnTTAG TTATAAACAT TGTTAATGGG CATCGTGCCG 11820
TGT 11823

(2) INFORMATION FOR SEQ ID NO: 137:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 692 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137: 



ATAATTATTA ACATGGTGTG nwv*vva\*aw TTTAGAAGTT ATCCACGGCT GTTATTTTTG 60
TGTATAACTT AAAAATTTAA GAAAGATGGA GTAAATTTAT GTCGGAAAAA GAAATTTGGG 120
AAAAAGTGCT r*aAHAAAAAT TATCAGCTGT AAGTTAGTCA ACTTTCCTAA nAAAWWA 180
AAGATACTGA ur«w111J\Wv~sj ATTAAAGATG GTGAAGCTAT CGTATTATCG AGTATTGCTT 240
TTAATGCAAA *i"T*r*r°*1t"i1iva r*JiAPAATATG rrAAGfAATr A1A1AAWlAw TTGTAGGCTA 300
lunAul LLJLftLi1i/\ TTAPTAPTGA AGAATTAGCA AATTATAGTA ATAATGAAAC 360
TGCTACTCCA AAAGAAACAA CAAAACCTTC TACTGAAACA ACTGAGGATA ATCATGTGCT 420
TGGTAGAGAG CAATTCAATG CCCATAACAC ATTTGACACT TTTGTAATCG GACCCGGTAA 480
CCGCTTTCCA CATGCAGCGA GTTTAGCTGT GGCCGAAGCA CCAGCCAAAG CGTACAATCC 540
mTTATTTATC TATGGAGGTG TTGGCTTAGG aAAAACCCAT TTAATGCATG CCATTGGTCA 600
TCATGTTTTA GATAATAATC CAGATGCCAA AGTGATTTAC ACATCAAGTG AAAAATTCAC 660
AAATGAATTT ATTAAATCAA TTCGTGATAA nA 692







(2) INFORMATION FOR SEQ ID NO: 138: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7900 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138:



ATACTGTAGC GCAAATTTCA CAATGGCATG TTATAGAAGA TTTAGTTACG AATGAATTAG 60
GTATTAGTAT TTTACCAACA TCAATTTCAG AgCAACTAAA TGGAGATGTG AAGCTGtACG 120
CATTGAAGAT GCTCATGTAC ATTGGGAATT AGGTGTTGTT TGGAAGAAGG ATAAACAATT 180
AAGTCATGCC ACAACGAAAT GGATAGAATT TTTGAAAGAC CGTTTAGGCT AACATATTAA 240
TAAAGCACTC ATTATTTAAG GCGCATCATT ACGTGGGTCA TTGAAATAAT GAGTGTTTTT 300
TTGTGAAAAT GAAGTGAAAT TTAGAGAGCG TT7CCATAGA AAATAGTAAT ACAAACTATA 360
AAAAAAGAGT ATTTTTATAT TGTGTACGCC ATCTTTATAA TAGTTATTGT AACAATTTAG 420
ACATATTTAG AAAGGGATGG CGCCATGCAC AAAGTCCAAT TAATAATCAA ACTACTACTA 480
CAACTAGGAA TCATCATTGT GATTACTTAT ATTGGCACAG AAATTCAAAA GATTTTTCAT 540
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ATTGTACCGC TAACTTGGGT AGAAGACGGT GCAAACTTTT TATTAAAGAC GATGGTCTTT 660

TTCTTCATAC CGTCAGTTGT AGGtATTATG GaTGtgCTTC CGAAATTACG CTAAATTATA 720 

TACTCTTTTT CGCAGTCATT ATCATAGGAA CATGTATCGT TGCATTATCT TCAGGTTATA 780 

TTGCTGAAAA AATGTCyGtT AAACwTAAAC ATCGTAAAGG TGTAGACGCt TATGAATGAT 840 

TACGTGCAAG CCTTATTAAT GATTTTGTTG ACTGTCG7TT TATATTATTT CGCTAAAAGG 900 

TTACAACAAA AATATCCGAA CCCATTTTTG AATCCAGCAT TAATTGCATC TTTAGGAATT 960 

ATTTTTGTCT TACTTATCTT TGGAATTAGT TATAACGGGT ATATGAAAGG TGGCAGTTGG 1020 

ATCAACCATA TTTTAAACGC AACGGTCGTA TGTTTAGCGT ACCCACTTTA TAAAAATAGA 1080

GAGAAAATTA AAGACAATGT CTCTATCATT TTTGCAAGTG TATTAAcTGG CGTCATGCTG 1140 

AATTTCATGT TAGTGTTCTT AACACTTAAA GCATTTGGCT ATTCTAAAGA CGTCATTGTA 1200

ACGTTATTGC CCCGATCTAT AACAGCCGCA GTAGGTATCG AAGTGTCACA TGAACTAGGT 1260 

GGTACAGATA CGATGACCGT ACTTTTTATT ATCACAACGG GTTTAATCGG TAGTATTTTA 1320 

GGTTCGATGT TATTAAGATT TGGAAGATTT GAATCTTCTA TCGCCAAAGG ATTAACGTAT 1380

GGGAATGCGT CACATGCATT TGGCACAGCT AAAGCACTAG AAATGGATAT TGAATCCGGT 1440

GCATTTAGTT CAATTGGGAT GATTTTAACT GCAGTTATTA GTTCAGTGTT AATACCTGTT 1500

CTAATTTTAT TATTCTATTA ATTTAGATAT TTAAAATGAT AGACAGAAAG GGAGGCTATT 1560

AGTAATAATG GCAAAAATAA AAGCAAATGA AGCATTAGTT AAAGCATTAC AAGCaTGGGA 1620

TATAGATCAC TTGTATGGTA TTCCAGGAGA CTCAATCGAC GCATAGTCGA TAgTTTACGT 1680

ACAGTGAGAG ATCAATT7AA ATTTTATCAT GTACGTCATG AAGAAGTAGC AAGCTTAGCG 1740

GCTGCTGGTT ACACAAAATT AACTGGTAAA ATCGGTGTGG CATTAAGTAT CGGTGGCCCT 1800

GGTTTAATTC ATTTATTAAA TGGTATGTAT GATGCCAAAA TGGATAATGT ACCGCAATTA 1860

ATATTATCTG GACAAACGAA TAGTACAGCA CTTGGAACGA AAGCATTCCA AGAAACAAAT 1920

TTACAAAAAT TATGTGAAGA TGTAGCCGTT TATAATCACC AAATTGAAAA AGGTGACAAT 1980

GTGTTTGAAA TCGTTAACGA AGCAATTCGT ACGGCATATG AACAAAAAGG TGTAGCTGTT 2040

GTTATTTGTC C7AACGACTT ATTAACTGAA AAAATTAAAG ATACAACGAA TAAACCAGTA 2100 

GATACATCAA GACCAACAGT AGTATCACCA AAATATAAAG ACATCAAAAA AGCGGTTAAA 2160 

CTAATTAATA AAAGTAAAAA GCCTGTCATG TTAATTGGTG TAGGTGCGAA ACATGCGAAA 2220 

GATGAGCTAC GTGAATTTAT TGAAATGGCT AAAATTCCTG TCATTCATTC ATTACCAGCT 2280 

AAAACAATCT TGCCGGATGA TCATCCATAT AGTATCGGtA ACTTAGGTAA AATCGGTACC 2340
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CCATATGTGG ATTACTTACC TAAGAAAAAT ATTAAAGCCA TTCAAATTGA CACAAATCCT 2460
AAAAATATCG GACATCGTTT CAATA7TAAT GTAGGAATTG TTGGAGATAG TAAAATTGCG 2520
TTGCATCAGT TAACTGAAAA TATTAAACAT GTTGCTGAAA GACCATTCTT AAACAAAACG 2580
TTAGAACGTA AAGCGGTTTG GGATAAATGG ATGGAACAAG ATAAAAATAA TAATAGTAAA 2640
CCATTACGTC CAGAACGATT AATGGCATCA ATCAATAAAT TTATTAAAGA TGATGCAGTG 2700
ATTTCAGCAG ATGTAGGTAC AGCAACAGTT TGGTCAACTC GATACTTAAA CCTTGGTGTA 2760
AATAACAAGT TCATCATTTC AAGTTGGTTA GGTACAATGG GTTGCGGTCT TCCAGGTGCA 2820
ATTGCATCAA AAATTGCATA TCCAAATAGA CAAGCCATCG CAATTGCTGG TGACGGTGCA 2880
TTCCAAATGG TAATGCAAGA CTTCGCTACA GCAGTACAAT ATGATTTACC TTTAACTGTA 2940
TTTGTACTTA ATAACAAACA GTTAGCATTT ATTAAATATG AACAACAAGC AGCTGGTGAA 3000
TTAGAATATG CAGTTGATTT TTCTGATATG GATCATGCAA AATTTGCTGA GGCAGCAGGT 3060
GGTAAAGGTT ATACAATTAA GAGTGCTAGC GAAGTAGATG CTATAGTCGA AGAGGCATTA 3120
CCACAACATG TACCAACGAT TGTAGATGTA TATGTTGATC CTAATGCTGC GCCATTACCA 3180
GOTAAAATTG TAAATGAAGA AGCGCTTGGT TATGGTAAGT GGGCATTTAG ATCAATTACT 3240
GAAGATAAAC ATTTAGATTT AGATCAAATT CCACCAATTT CAGTGGCAGC AAAACGTTTC 3300
TTATAACTGA TTTAAAGGTT ATCACAATTG AATTGAACTA TAAAAACGGT AATTTCTATT 3360
TCAACAAAAT GGGAATTGCC GTTTTOTTTA TTTATCACAA ATGATCGTAC TGAATTGATG 3420
ATAAAATTGT GAAAAAGTTG TTGAAAACGC TTTTACAAAT ATGTATAATA GCTATGAATT 3480
AGATATCACT TGCGTGTTAC TGGTAATGCA GGCATGAGCA AACAACCGCA CTATGAGAAT 3540
AGTCTTGTTT GTTCATGCCT GCTTTTTTTG TACATGGAAG CGGAAATTGA GATAGGGGAT 3600
GTTTSTATGT TTAAGAAATT GTTTGGACAA TTGCAACGTA TCGGTAAAGC ATTAATGTTA 3660
CCTGTTGCGA TTTTACCAGC AGCTGGTATT TTATTAGCGT TTGGTAACGC AATGCACAAC 3720
GAACAATTAG TAGAAATTGC ACCATGGTTA AAAAACGATA TCATTGTAAT GATTTCGTCG 3780
GTCATGGAAG CAGCAGGACA AGTTGTATTT GATAACTTGC CATTATTATT TGCAGTTGGT 3840
ACAGCACTTG GATTAGCAGG AGGAGACGGT GTTGCAGCAT TAGCAGCGCT AGTAGGTTAC 3900
TTAATTATGA ATGCAACAAT GGGGAAAGTG TTGCACATTA CAATTGATGA CATTTTCTCA 3960
TATGCCAAAG GGGCAAAAGA ATTAAGTCAA GCAGCGAAAG AACCAGCACA TGCTTTAGTA 4020
7TAGGTATTC CAACGTTACA AACGGGTGTG TTTGGTGGTA TTATCATGGG TGCTTTAGCC 4080
GCATGGTGTT ACAACAAATT TTATAATATT ACACTACCAC CATTTTTAGG ATTCTTTGCA 4140
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AGCTTTGCGT GGCCACCAAT TCAAGATGGA TTAAATAGTT TATCGAATTT CTTATTAAAT 4260

AAAAATTTAA CATTAACAAC GTTTATATTC GGTATTATTG AACGCTCATT AATTCCATTT 4320

GGTTTACATC ATATTTTCTA TTCACCGTTC TGGTTTGAAT TCGGAAGTTA TACAAATCAC 4380

GCAGGTGAAT TGGTTCGTGG TGACCAACGT ATTTGGATGG CACAATTGAA AGATGGCGTA 4440

CCATTTACTG CTGGTGCATT TACTACTGGT AAATATCCAT TTATGATGTT TGGTTTACCA 4500

GCGGCGGCAT TTGCTATTTA TAAAAATGCA CGACCAGAAC GTAAAAAAGT CGTGGGTGGT 4560

TTAATGTTAT CAGCAGGATT AACTGCATTT TTAACTGGTA TCACTGAGCC ATTAGAATTT 4620

TCATTCTTAT TTGTAGCACC AGTACTTTAT GGAATTCACG TATTATTAGC TGGTACATCA 4680

TTCTTAGTAA TGCATTTATT AGGCGTTAAA ATTGGTATGA CATTCTCAGG TGGTTTCATA 4740

GATTATATTT TATATGGTTT ATTAAACTGG GATCGTTCAC ACGCATTATT AGTTATTCCA 4800

GTCGGTATTG TATATGCTAT CGTGTATTAC TTCTTATTCG ACTTTGCAAT TCGTAAGTTT 4860

AAATTGAAAA CACCAGGTCG TGAAGATGAA GAAACTGAAA TTCGTAACTC TAGTGTCGCA 4920

AAATTACCAT TTGATGTCTT AGATGCAATG GGTGGAAAAG AAAACATTAA ACATTTAGAT 4980

GCATGTATTA CACGTCTACG CGTAGAAGTG GTTGATAAAT CAAAAGTAGA TGTAGCAGGT 5040

ATTAAAGCTT TAGGCGCATC AGGTGTATTA GAAGTTGGAA ACAATATGCA AGCTATCTTT 5100

GGTCCAAAAT CAGATCAAAT TAAACATGAT ATGGCCAAGA TTATGAGTGG TGAAATTACG 5160

AAACCAAGTG AAACGACAGT GACTGAAGAA ATGTCAGATG AACCAGTTCA CGTAGAAGCA 5220

CTTGGAACAA CAGACATCTA TGCACCAGGT ATCGGTCAAA TCATTCCATT ATCAGAAGTA 5280

CCTGATCAAG TATTCGCTGG TAAAATGATG GGTGATGGTG TTGGCTTTAT CCCTGAAAAA 5340

GGTGAAATTG TAGCACCGTT TGATGGTACA GTGAAAACAA TCTTCCCTAC GAAACATGCG 5400

ATAGGATTAG AATCTGAAAG TGGCGTCGAA GTACTTATTC ATATTGGTAT CGATACAGTG 5460

AAACTGAATG GTGAAGGATT CGAAAGTCTG ATTAACGTTG ATGAAAAAGT AACACAAGGT 5520

CAACCATTAA TGAAAGTGAA TTTAGCATAC TTGAAAGCAC ACGCACCAAG CATCGTTACA 5580

CCAATGATTA TTACAAATCT TGAAAATAAA GAACTTGTCA TTGAAGATGT ACAAGATGCT 5640

GATCCAGGTA AGCTAATTAT GACAGTCAAA TAATGATTAA AAATGAAACA GCATATCAAA 5700

TGAATGAACT TTTAGTCATT CGTAGTGCGT ATGCGAAGTA GCGAGTTGAA AGAGAATACG 5760

TTACAAAAGG CAGTAGCTTA AAATGAAGCT ACTGCCTTTT TAGTGCGCAA TGATGTATAG 5820

CAGGTGTGTT GATGrTAATA AGTTAAATAT TAGTGTTAGA TATAGAAAAC ATTGCTTATG 5880

TTTTTGTCAC ATTTTAGAAA AATGCATCTT CGCGACTAGC CAAATTAATA GTCTCATTGA 5940

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AATAAATTAA CATGATTTTA AATCTATTTG TAAGATAAGG AGATTTGTCA TTATGACAAC 6060

AGAAGGTCTA TTAGTTGCAG AGAAAGAAAT CGAAGTGAAT GGTTACGACA TTGATGCGAT 6120

GGGTGTCGTT AGTAATATCG TTTATATTAG ATGGTTCGAA GATTTGAGAA CAGCGTTTAT 6180

TAATCAGCAC ATGAATTACT CAACAATGAT CAATCAAGGC ATTTCACCTA TACTTATGAA 6240

AACGGAAGCA GAGTATAAAG TACCTGTCAC AATACATGAC AAACCAGTAG GTCGTATTTA 6300

CTTAGTTAAA GCAAGCAAGA TGAAATGGGT GTTTCAGTTT GAAATTGTGT CCGCACATGG 6360

CGTGCATTGT ATTGGTACAC AGACAGGCGG TTTTTACAGA TTGAGTGATA AGAAGATAAC 6420

CTCTGTGCCA CAAGTGTTTC AAGACATTTT AGCAACAAAA TAATGACTTC ATTTTAAAAT 6480

ATAAAAAGTA AGAAGGTGTT CGAAATGGTT AAGCAATTAA ATAGTGTCGA AGCATTCCGT 6540

GAATTTATTC ATCAATATCC GTTAGCAGTT GTACATGTCA TGCGCGATCA GTGTAGCGTG 6600

TGTCATGCCG TTTTACCACA AATTGAAGAC TTGATGCAAT CATATCCCAA TGTGCCATTA 6660

GCTGTGATTA ATCAAAGTCA GGTGGAAGCT ATTGCTGGAG AATTAAATAT TTTCaCTGTA 6720

CCTGTGGATT TAATTTTTAT GAATGGAAAA GAAATGCATC GTCAAGGGCG TTTTATCGAT 6780

ATGCAACGTT TTGAACATCA TCTTAAGCAA ATGAATGATA GTGTAAATAA CGATGTCGAT 6840

GAGCATTAAT ATCGCAAATG ATTAGCATTG CTAAGATTAT GTAGACATCA TAACTTATTT 6900

CCCAGTAAAT ATTGGTAGTA ATTAGAATCA GCATGGTACA GTAGAACTAT AGTAGAAATC 6960

ATCAAAGAGG AGTGACGACA AATGCGTAAA AAATGGTCTA CACTTGCGTT TGGATTTTTA 7020

GTTGCAGCAT ACGCACATAT TAGAATTAAA GAAAAACGCA GTGTGAAAAG TTATATGTTA 7080

GAACAAGGTA TACGATTATC TAGAGCTAAG CGTCGTTTTA TGTATAAAGA AGAAGCGATG 7140

AAAGCATTAG AAAAAATGGC GCCACAGACA GCAGGCGAAT ATGAGGGAAC CAATTATCAG 7200

TTTAAGATGC CAGTAAAAGT GGATAAGCAC TTCGGTTCAA CCGTTTATAC CGTTAACGAT 7260

AAACAAGATA AGCATCAACG CGTTGTATTA TATGCACATG GAGGCGCATG GTTCCAAGAC 7320

CCACTCAAAA TTCATTTCGA ATTTATTGAT GAACTTGCAG AAACACTCAA TGCTAAAGTC 7380

ATCATGCCAG TATATCCGAA GATTCCGCAT CAAGATTATC AAGCGACGTA TGTGCTTTTT 7440

GAAAAGTTGT ACCATGATTT ATTGAATCAA GTAGCAGATT CTAAACAAAT CGTTGTAATG 7500

GGTGACTCTG CGGGCGGTCA AATTGCTTTA TCATTTGCTC AATTGTTAAA AGAAAAACAT 7560

ATTGTGCAAC CAGGACATAT TGTATTAATT TCACCAGTTT TAGATGCAAC GATGCAGCAT 7620

CCTGAAATTC CTGACTACTT AAAGAAAGAC CCAATGGTAG GTGTGGATGG CaGTGTGTTC 7680

TTAGCTGAAC AATGGGCAGG GGACACACCT TTAGATAACT ACAAAGTATC ACCAATTAAT 7740
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
CCAGATGCTT TGAACTTATC GCAATTGTTG AGTGCGAAAG GTATCGAACA TGACTTTATA 7860
CCTGGATATT ACCAATTCCA TATTTATCCA GTATTTCCGA 7900 
(2) INFORMATION FOR SEQ ID NO: 139: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1984 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139: 

GTCTAAATAA ACAAAATTAT CATTGATTaC TGAACTGGCA TTTCGAAGTA ATGCTTCAAT 60 

ATCATTCGAA TATTTCTTCA ATTTATGATT GTGAAATAAT TCTTGCATCA AAAATGGTCT 120 

TTGGTCACAT GAATGTGCAT CTGAAGCTAC AAAATGAGCC AAATTACATT CTATAAATTG 180 

TAATGATAAC TTTTGAATGT TTTTACCAAA TCCACCAACT AAAGAACTCG ATGTTAATTG 240

ACTCAGTGCC CCATTTGCAA CCAATTCATA TAATATTTCC GGATTTTTGG CGATACTTCT 300

ATTTCTTTCA GGATGTGCAA TGATTGGTAT GTAACCTCTC GATTGTATTT CAAAAAACAA 360

TTGTTTTGTA TAATGTGGTA CTTCGCCCGT TGGAAATTCA ATTAATAAAT ATTTCGAACG 420

ATTAATACCT TGAATACTAC CATTATCTAA GCCTTTCAGA ATCGAATCTG TAATTCTAAT 480

TTCTTGCCCG GGAAATAATT TAATATCCAA TGCTTGAACT TCTGGATGCG TTCTTAACTC 540

CGCCAATTTC ACAAGCACTT GTTGAAATGT ATTATCATAT CTCGGATGCA AATGATGAGG 600

TGTCGCTACA ATACTTGTTA CACCTTCATC CTTAGCTTGC TTTAATAGTG CAATACTCTT 660

TTCAATTGTT TTAGGACCAT CATCTATATC AACTAATATA TGGTTATGAA TATCAATCAT 720

GATTCATCAG TCCCATAATA TGCATAGTAA CTAGCACTTT TATCTTTAGG CATTCTATTT 780

AAGACTACAC CTAATAATTT AGCACCTGTT GCTTCAATAA GTTCTTTTCC TTTTTTAACT 840

TCATCTCTAT TATTATTTTC CGAATTAACT ACGTAGACAA CATTGCCGGT AAACTTTGAA 900

AATAATTGCG CATCTGTAAC TGTGTTCACT GGTGGCGTAT CGATAATTAC AAAGTTATAA 960 

TTCATCAATA ATGTGTCATA CAAATTTGCA AATGCCCTTG ATGTAATTAA CTCTGACGGA 1020 

TTCGGTGGGA TTGGCCCAGA CGTCAAGACG TCTAAATCTT GAATTTCAGT TGAGATAATA 1080

CTGTCTTGAT AAGTTGACCA ATTTAGCAAT AAACTTGATA GGCCTTCATT GTTTGGCAAA 1140 

TTAAAAATAT AATGCTGCGT AGGTTTACGC ATATCCCCGT CTACGATTAG TGTTTTATAA 1200 

CCTGCTTGCG CATATGCAAC TGCTAAATTT GCTGCAATTG TAGACTTACC TGCGCCTGGT 1260
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GATCTTATGC CTCGAAATTT CTCGCTAATA GGTGACTTTG GTTGTTCATG GACAATTAAA 1380

CTTGATGTAC TTCyTCGTGT ATTCGTCATG GTAATTCCTC GTAAATTAAA ATTTTTGTAT 1440 

TGAACCTAAA ATAGGTAATC CTAGTTGCGA TTCAACATCT TCTTCTGTCT TAATACGCTT 1500 

ATCTAATAAT TCTTTTAAGA AAATAATCAA TATTGCTAAA ACAATACCAA CAATAATGCT 1560 

GATAACTAAG TTGACAGATA CTATTGGAGA TACTTTTACA GCATTATCAT GTGCTGAGGA 1620 

AAGTATCGTA ACATTATCAA CACTCATAAT TTTAGGCATG TCATGAGCAA AAACTTTAGA 1680 

TATTTTATTA ACAATTTTGT CAGATTCAGA TTTATTCCCA GTGGTAACTG ATACAGTAAT 1740 

AATTTGAGAG TTTGTTTGAT TGGTTACTTT TAAAAATGAA TTCAACTCAG CTGTTGAATA 1800

CTGACCATCA AnTTCTCTAG ATACTTTATC TAGAATTCTA GGACTTTTGA TAATTTCCGT 1860 

ATATGTATTA ACAGACTGCA AACTACTTTG AACATTTTGG AAAGCTAAAT CACTTGAGGA 1920 

CTTTTTCATG TTCACTAATA TTTGAGTAGA AGCAGTATAT TTGTCAGGCA TAACAAAAAA 1980 

GGTT 1984 
(2) INFORMATION FOR SEQ ID NO: 140:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6272 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140: 

CAAATCCCTT GGTGATGAtA AAtGtATTGC TGTGTAGCCA AATAATCTTC GTATATATGA 60 

CTGACGTTCA ACAACAGCTT GCAATCGTTT CGTTGGTACA GTTACTTTCT TCTTGTTAAA 120

GAGACCATAT TCAATTTTAA GTTGCTCATT TTCAAGCATC ACCGAAAAGC CATAAAATCT 180

TATCATTGTT ATAATCGTTC CAATAATATA TGCCACTATT AATACTAGTA AAATGATGAT 240

TAATACTGAA ATACTTACAA TTTGAACCCA TTGACTAATT TCATGATTTA GCTTCGACCA 300

TGGGATCAAC TCTCTTACAG CCCCGTAAAT CGGTACTAAA GCTGCTAACG TTACACCAAT 360

GGCGCCACTG GTCATTGCCA TAAATAGTGA TTCTTTAAAA TTCATCTGAT ATATAGGAAT 420

GCGTTTATTT TTCTGATTAA GCATACTATC AGTGTTCTGC ACTTCATCTA AGCGACCTTC 480

TGCGATGTCT TCCACATTAC CTTCAATGTC ATGATTACAG TTGTCATTCT TCTCAGCACT 540 

AGACTTTTGC GCCACTTCTG TCTTCAACTC TGTTTGCAAT TGATCAATAT ATCGTTCAAG 600 

ATATTCACCT TGTTTTTTCG AAATAACACT TAAGACAATA CCATCACTTG GTGTTTTGAT 660 
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AATACGTTTT ATATTTAATT CTTTACGCTT TTTATTAAAA ATACCTGTTG TTAAAATGAA 780 

ATAATTATCC tCAATCCAAT ATCGCGTGTT CATAATTCCG ACAATTTGAG AAATGTATGA 840 

TATTAAAAAG AATACAAATA CAATACCTAT CCATAAATAT GATTCGGGAT TCGTATAATC 900

AAAATCTTTC AATTGAAAGA TAATGAAAAT AAAAAAGACG ACTATGTTTT GTTTGATAGC 960

ATTGATTATG CCATTAAAAT ATGAAATCGG ATGTAATTTT TGAGGTTCAG ACATCACTTT 1020

CAACCCCTCT CAAATTCGAC ATAGTTCTCT CTTCGATTAT TTTAACATCG TCATGAGACA 1080

TCATCGGTAA ATAAATAGTA TGACCTGCAG TCATAAATCC AACTTTATAC AAATTAAGCA 1140

CTTTACTAAT TGGATTAGAT TTAATCGACA AGTATTGTAA ACGTTCAATT CGACTCGTTT 1200

CTTCTTTATA TATAAAAAAT GATGTACGAT ATTGTACACT TAGTTGATCA ACTTTATAAA 1260

AGCGACAATG ATATTGCCA7 AAAGGCTTAA TAAATAATTT TAATGTACTC AGAGCACCTA 1320

AAACCAACAA AATATAAAGT AAGTAATGTG GCCATTCAAA 7CTTAACCAT ATAAAATAAA 1380

AAATGACATA CACAGCTACA CTCAATATAA AT7CTAAGCC ATTCG7AATG TAGTAA7ACA 1440

ACAATGCTGA CTTAGGACTC TTAGTCAACT TAGTATAATC TGACATATAC CCCTCTCCCC 1500 

AAATAAAAAA TTATACGGAT TTATAATCTA TTTCATTTTA TTTTTATATG ATGATAATTA 1560

TAGCATATGG AATATTTCAT GCTAATTTAT TCTTCCTAAA GGTACATCTA AAAATTTAAT 1620

TAAGCAGAAA GTGCTTGAAT TGCTAAAAAG ACACCATGTT ATAATTTTAT CAACATGATG 1680

CCTTTCATCT ATAATCAATC TTTCATCTTA TCAAGAGCGA TATTTAGTTC AAGCACATTC 1740

ACATAATCAT TTGTTAACAC ACCACGCTGC TTACGATGTT GAATCAAGTC GGCCACTCTT 1800

GAAGTAGATA CATGACGAGC ATCAGCAATA CGAGGTGCTT GCTTCAATGC ATTTTCGACC 1860

GTAATATGCG GATCTAAGCC CGACCCAGAA CTTGTTGCAG CATCTATTGT TACATTTGAA 1920

TTCCCAAATT TAACATGATG TTTCATGCGT GCTATTAATT CGGTGTTTCC ATTCGATTCA 1980

TTACTTCCAC CTGAAGATAC GCCGTTTTTA TATAATTTTT CAGGATTCAT ATTATAATCA 2040

ACTGCACTCG GTCTCCCGTG AAAATATCGT GTCTCTGTCC AGTGCTGTCC AATCAATTTT 2100 

GATCCAACTA TACGATTGTC ATACGTAATT AAACTGCCAT TTGCTTGTTG ATAAAAAAAT 2160 

ATTTGACCAA TTAACGTGAT AGCTAACGGG AATAAAAATC CACATAATAC CATAGTTATT 2220

ATCGTTAAAC AAATACTATT TCTTATCGTA TTCATGGTAC AGGCTCCTTC CTCTTTACAC 2280

AAAAAATTGT ACAATCATAT CTATTAATTT AATGCCTAAA AACGGGACGA TTAATCCACC 2340

TAATCCATAA ATCAACATAT TATTTATAAA GATTCTATCA ATGCTGTAAC CCTTTACTTT 2400 

TACACCTTTC ATGGCAATTG GAATTAAGGC AACAATGATT AATGCATTGA ATATCAAAGC 2460 
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TTTATTTTAA AGTTAAAAAT TCACCAATAG 
AACCACTTAG TAAAACGATA AATACGATTA 
CTATTGTATA TTTATCTTGA TGGTATGATT 
ATTGCAAAAT AATTGGTATA TAACGAGAAA 
AGAATGTTGT ATCATCTTTC AGTCCTTCAA 
TCATTTCATA CATAACTTGT GAAATACCAT 
CTCCAGGAAT CATAAAAGCA AGTGCTGAAA 
AGACTAAGAC AATACATTTC ATTTCACGGG 
TTTTACCAAC CATCAAACTG CATATAAACA 
TGAGTCCTAC GCCTTCGCCA CCAAATACAA 
ATCCACCTAT AGGCGTTAAG CTATCATGCA 
TCGTAATAAC TGTAAATAGT CCTGACAAAC 
TATTCGGTCC ATAAATGCCT AAATTCGCTA 
TAGTTAATGT AAGAATTGCT ATAAAAATGA 
GACGATGTAC TCGTTTACCA TGTCTACTTA 
TAGGAAGTAA CATCATACTG CCCATTTCTA 
AAGGTGTTGC AGAATTTCCT GCTAAAAATC 
ATTCAAGTGA TGCAATAGGT CCAAATGCAA 
TCATTAAATT AGCATGCAAC GTTTGTGGTA 
AACATGATAA TGGTAAAAGT ACTCGGACAA 
CAATGATATT AGTTAATCCA GTTAAACGTC 
ATGCACTAGA TGTAAACATT AAATATGTCA 
CTGaTTCACC GTTATAGTGT TGtAAATTAC 
ACGCTAAATC TATCGATTGG TTTAAATTAT 
CTATTAGCAA TACAAATGTT ATAAACCCCA 
CATATGTTTT AGCTGACATG TGTTCTAAAT 
CAAATCTAGT AAATATTAAA TCTACTCTTG 
ATCCACTAAA AACATACGTA ATCATAACCA 
TAACCCTCAC TTAATATATT TCTAAAATTT 



GACCAAGTAA TAGTACTGGA ATAAATGTCA 
GTGATACGCC AAAATAAGGT TTATCAATCG 
TTTTATTCAC TAAACTTGAT GCAATCATTA 
GCAACATAAT GATTCCTGTA GAGATATTCC 
ACCCTGATCC ATTGTTCGCA GCAGCTGATG 
GAAAAGACGG ATTCGTtATa CTTtCACTTG 
ATACTAAAAT TAAAATTGGG TGTATGAGAA 
CGCCAATTGG CATATTTAAA TATTCTGGTG 
CCGTCAGTAA GACAAATATC AATAAATTCA 
CATTTAGCAT CATTAATACC ATTGGTCCTA 
TGTTATTAAC AGAACCCGTT GTAAATGCCG
CTGCTCCAAA CCGTACCTCT TTACCTTCCA 
GTATTGGATT ACCACGATAC TCACTCCACA 
AAAACATTGC GACAAATAAT ATCAACGCAT 
ACATGCGACC AAATAAGAAC AACATTGACA 
TAAAATTGCT CCAAATATTT GGATTTTCAA 
CTCCACCATT CGTACCAAGA TGTTTTATTG
TATGTTGAAT ATGTCCGCTT AAAGTCCGAA 
CaCCTTGAGT CATCAATAAA ATACTAATTA 
TAAACCGAAC AATATCTTGA TAAAAATTAC 
TCAACATCGC TATACAAACG GCGTAACCTG 
TTACAATCAT TTGCGTTAAA TATGTCACAT 
TATTTGTTAA AAAAGATATT GCTGTATTAA 
GATTTGGATT TAAAAAAAGC CATTGCTGAA
TAAATCCATT AAATGCCAGA AAATGTTTGA 
CTGTGCCGAT AATTTTAAAA CACATATTTT 
ACGATTGCAC CAATGCTACG CGATATAGAT 
TCATTGTTAG AAACAAAATT ATTTCCATGA 
TTCACTACGA ATTAAGGCAT AAAATAAATA 
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ACACAACAAC ATCGTAACAA CTTGTTTATG AGAGAAATnT TAATTTTCAA ACTTAGTTAT 6180 

TAAGAAAnCA TTAAGATGTG TATGCAGAAA TAAATTTTAT AGCATTTAAT TGTGAAGAAT 6240 

ATTATGATAT TGCTATCGAG GTGAAGGTTA TG 6272

(2) INFORMATION FOR SEQ ID NO: 141: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1978 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141:

AAATGATGTT TTACAATAAA TATAnAAACG TATCAACATA TATCATCATA TTTTTAGTTT 60 

CAAGTGCAGC CTTTGCAATA TTCTTGTTAA GTGCGnACAT TAGTGCTCAC TCGGAACAAG 120

TGTACGAAAT GACTGACCAT CAAATTAAGA ACAATACGAT AAATAAAGCA TACGAACATA 180

AAGACCCTAC AAACAATAGC GAACAAAGAG ATGGGAAAGT GTTCGCTTTA ATAAATTGAT 240

ACATTGTCAC AACGTTATTT TGCCTATTTT TGCGmAATAG CGTTTTTTAT TACwTTTTTG 300

CTGATsTTAA ATTTGTTATA TTTTGTTAAA GTATTATAAT GATTGAATAA ACAAATTGAA 360

GGTAGGTTTT TTAATTGAGT AATTCTGATT TGAATATCGA AAGAATTAAC GAGTTAGCTA 420

AAAAGAAAAA AGAAGTAGGA TTAACTCAAG AAGAAGCAAA GGAGCAAACA GCCTTAAGaA 480

AAGCTTATCT TGAGAGTTTT AGAAAAGGGT TTAAACAACA AATTGaAAAT ACTAAAGTAA 540

TTGATCCAGr AGGTAATGAT GTAACACCTG AAAAAATTAA AGAGATACAA CAAAAAAGAG 600

ATAATAAAAA TTAAATCACA AATCTGTAAA GAATTTTCTG ACATTATAAC TTGAAATAAG 660

TATtTTACTT ATCTTTTTAT TTTAAAATAA GTTATAATGT ATTTGATAAA ATTGAAGAAG 720

GGAAGATACA CAAGATGTTT AATGAAAAAG ATCAATTAGC TGTTGATACG CTACGTGCAC 780

TAAGTATCGA CACAATCGAA AAAGCGAATT CTGGTCATCC AGGATTACCT ATGGGAGCTG 840

CCCCAATGGC TTACACTTTG TGGACACGTC ATCTGAATTT TAATCCACAA TCTAAAGATT 900

ACTTCAATAG AGACCGTTTC GTATTATCTG CAGGGCATGG TTCAGCATTA TTGTATAGCT 960

TGTTACATGT TTCTGGTAGT TTAGAATTAG AAGAATTAAA GCAATTTAGA CAATGGGGTT 1020

CTAAAACACC AGGTCATCCT GAATACAGAC ATACAGATGG TGTAGAAGTT ACTACCGGAC 1080

CACTTGGACA AGGTTTTGCT ATGTCAGTAG GATTAGCTTT ACAGAAGATC ACCTAGCAGG 1140 

gAAATTTAAT AAAGAAGGAT ATAATGTTGT AGATCATTAC ACATATGTAT TAGCTtCTGA 1200 
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AAGTAAATTA GTTGTTTTAT ACGATTCAAA TGATATTTCA TTAGATGGCG AATTAAACAA 1320 

AGCTTTTTCT GAAAACACAA AAGCTCGTTT TGAAGCATAT GGTTGGAATT ACTTACTAGT 1380 

TAAAGATGGT AATGATTTAG AAGAAATTGA TAAAGCGATT ACTACAGCTA AATCTCAAGA 1440 

AGGACCAACG ATTATTGAAG TTAAAACAAC AATCGGATTT GGTTCACCGA ATAAAGCAGG 1500 

AACTAATGGT GTTCATGGGG CACCTTTAGG TGAAGTTGAA AGAAAATTAA CATTCGAAAA 1560 

TTACGGTTTA GATCCTGAAA AACGTTTTAA TGTTTCAGAA GAGGTATACG AAATTTTCCA 1620

AAATACTATG TTAAAACGTG CTAATGAAGA TGAATCTCAA TGGAATTCAT TATTAGAAAA 1680 

ATATGCAGAA ACATATCCTG AATTAGCAGA AGAATTTAAA TTAGCGATTA GTGGTAAATT 1740

GCCTAAAAAT TATAAGGATG AATTACCACG TTTTGAACTG GGTCATAATG GTGCATCTCG 1800

TGCTGATTCT GGTACTGTTA TTCAAGCAAT CAGTAAAACT GTCCCTTCAT TCTTTGGTGG 1860

ATCAGCAGAC CTTGCTGGTT CAAACAAATC CAATGTAAAT GATGCAACTG ATTATAGTTC 1920

TGAAACACCT GAAGGtAAAA ATGTGTGGTT TGGTGTACGT GAATTTGCTA TGGGTGCT 1978
(2) INFORMATION FOR SEQ ID NO: 142: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7588 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142: 

TAGTAGTATT TATTAAATTA TACGAAGGGA CCcAACACAG AAAATTCATT TTATTGAATT 60

TTACATTTAT GTGCCAAGTT GGGAAAAATG TCTTATTTTT TCaAAGTATT TAAAAGTAAA 120

ATTACATGTT AATACGTAGT ATTAATGGCG AGACTCCTGA GGGAGCAGTG CCAGTCGAAG 180

ACCGAGGCTG AGACGGCACC CTAGGAAAGC GAAGCCATTC AATACGAAGT ATTGTATAAA 240

TAGAGAACAG CAGTAAGATA TTTTCTAATT GAAAATTATC TTACTGCTGT TTTTTAGGGA 300

TTTATGTCCC AACCTTTTTA GAATATTAAA TTTCTACAAT TTCGTCATCT TCAACAATAA 360

AGCCCATTGT ATTGACGCTG TTATTTAAGA AAGTCAGAAT ATAACGCATT ACTTCATCAC 420

GTTCTGGCTC ATTGTGAACC TCGTGGTAAA AACCTTGCCA AGCTTTAAAA TATAATTCAG 480

GTGTTTGATA TTTTTCTTTA AACTCATCAA TTGCCCTAGT ATCAACAATT AAATCCTTCG 540

TTCCATACAT TAATAGCGTT GGCATTGGTT GAATGTCATG AATATGAGCC ATCGTATCTT 600

TCATCGTCTC ATTAATTGTA TTATACCAAT GATACGTTGC TTTTTTTAAC ATTAAACCAT 660 
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CATTAAAACG TGTGTCTTTT GAAATTTTAC CTATATTTGA AACAAGTTTA TCTTTACGAT 780 

TTTTTCCATT CTTTTGAAGT TCTAGCATAG GAGAAATTAA CATCATCCCC TCGATTGGCA 840 

ATTCTACTTT TTCAAGTAAA TTTAATAAAA TCAAACCGCC AAGTCCTACC CCTAATACAT 900 

AAGTAGGAAT TTTATATTCA TTAGCTATCT TTAACCAGTC TAGCAAACTT TCGTGATACG 960

TTTGAAAGTT TTCAATTTGT CCTTTATTAG CTCTTGAAGT TTGACCTTGA CCAGGCAAAT 1020 

CTCCCATAAT CACATGATAG CCATTTCTTC TTAACATCGT AATAACATAT GCATATCTTC 1080

CCGTATGTTC TAATATATTA TGAGCAATAA CAACGACGCC TTTCGCATGA TTTTCAGCTT 1140 

CCCACTTCCA CATTATTATA CTGCCCCTTT TTCATTAATC TTCAATAACA TAATTATAGC 1200 

AAATTCACTA TGTAGATTTC TATTTATAGT ATTATTGTTG TCCATATTAT TATATATAAA 1260 

TGAAATCAAC ATCAATAATA GTGTAATTAT ACATAATTAT TTTTGATTGT TTTTGATGAA 1320 

AACGCTTTCT CGAATATTTT TTTCATGCTA AACTTATTGT AAACACAAGG GTTTGGAGGA 1380 

GTAGCAATGG CACTATTAAA GAATTTTTTT ATCGGATTAT CTAATAATAG TTTTTTAAAC 1440 

AACGCAGCAA AAAAAGTGGG CCCACGTTTG GGCGCCAATA AAGTCGTTGC CGGAAATACA 1500 

ATTCCAGAGT TAATTAATAC AATCGAATAC TTAAATGACA AGAATATCGC TGTTACGGTA 1560

GACAATTTAG GGGAATTTGT CGGTACAGTT GAAGAAAGTA ATCATGCTAA AGAACAAATT 1620

TTAACAATTA TGGACGCGCT TCATCAACAT GGCGTAAAGG CACATATGTC TGTTAAATTG 1680

AGTCAGTTAG GTGCAGAATT CGACTTAGAA TTAGCTTACC AAAATTTAAG AGAGATTTTA 1740

CTTAAAGCAA ATACTTACAA CAATATGCAT ATAAATATTG ATACTGAAAA ATATGCTAGC 1800 

CTGCAACAAA TTGTTCAAGT TTTAGATCGC TTAAAAGGCG AATTTAGAAA TGTTGGTACT 1860 

GTAATTCAAG CATATTTATA CGATAGCCAC GAATTAGTTG ATAAGTACCA AGATTTACGA 1920 

TTACGTTTGG TTAAAGGTGC ATATAAAGAA AACGAATCAA TTGCATTTCA ATCTAAGGAA 1980 

GACGTAGATG CAAATTACAT CAAAATAATT GAACAACGTT TGTTAAACGC ACGCAATTTC 2040 

ACTTCAATTG CAACACATGA CCATCGCATC ATTAATCATG TAAAACAATT TATGAAAGAA 2100 

AATCACATTG AAAAAGATCG TATGGAATTC CAAATGCTCT ATGGTTTTAG ATCAGAG7TA 2160 

GCAGAAGAAA TCGCAAATGA AGGCTATAAT TTCACTATTT ATGTACCTTA TGGCGATGAT 2220

TGGTTTGCGT ATTTTATGAG AAGATTAGCA GAACGCCCAC AAAACCTATC TCTTGCTGTA 2280

AAAGAATTTG TGAAACCTGC TGGCTTAAAA CGTGTTGGCA TAATTGCAGC TTTAGGAGCT 2340

ACAGTTATGT TAGGTTTAAG TACAATTAAA AAATTATGCC GTAAATAGAG CAAGACA7AA 2400

ACAATAATTT AGGAGTCTGG AACAATAATC AATGTTCTAG GCTCCTAAAT GTTATATTGG 2460 

EP0 786 519 A2 

TAGATTTTAA TAAATTAGCC ATTTCAATTG CACTTACTGC TGCTTCAGCA CCTTTATTGC 2580 

CAGCTTTCGT ACCTGCTCTT TCCACAGCTT GTTCAATAcT TTCAGTCGTT AAAATACCAA 2640 

ATATGACTGG TACATTAGTT TGATCATTCA CTTTAGAAAC ACCTTTCGCG ACTTCATTAC 2700 

AAACATAATC ATAATGAGAC GTAGCACCGC GAATTACGCA TCCTAATGTA ATTACTGCAT 2760 

CATAATTTCC TGATGAGGCT AATTTTTTAG CTACTAAAGG AATTTCAAAC GCACCTGGCA 2820 

CAAATGCTAC ATCAATATTG TCTTCATTAA CATCATGTCG AATCAAAGTA TCTTTTGCAC 2880

CTTCAAGTAA TCTTCCAGTG ATAAAATCAT TAAATCGACT AACTACGATT GCAACTTTCA 2940 

AATCTTTTCC AATTAATTTA CCTTCAAAAT TCATGTTAAA ATCCTCCTAT ATTAAATGAC 3000 

CCATTTTTAT TTTTTTCGTT TCCATATAAT CATGATTATG TACCGTTTCT GGTACGATAA 3060 

CTTCAATTCT TTCTGCAATA TCAATGCCAT ATTGTTTTAA TCCCTCAAAT TTACTTGGAT 3120 

TATTACTTAA TAAATTGATA TGTTCGATGT TAAAATATTT TAAAATCTGT GCAGCAATAT 3180 

GATAATCTCG CAAATCTTCA TCAAAACCTA ATGCTAAATT TGCAGTTACT GTATCATATC 3240 

CTTGCTCAAT TAATTCATAT GCGCGTAATT TGTTTAACAA TCCTATGCCA CGACCTTCTT 3300

GAGGTAGATA AATAATCATG CCACCATGTT CATTGATATA CTTCATAGAC GATTCAAGTT 3360

GAGCACCACA ATCACAACGT TGACTATGGA AAATATCGCC TGTAAGgCAC GCAGAATGTA 3420

AGCGTACATT TTCATGTTGT CGAATTGCAC CTTTTGTCAG TACAACTATC TCTTCATCTG 3480

TGTATGTCGC TTTAAAACCA TACATATCAA ATGTTCCGAA ATCTGTAGGC ATTTTCACTT 3540 

TTGCCTTAAA TTCAATTTCT GGTTCTAATT TTTTACGATA TTCAATTAAA TCATCAATCG 3600 

TAATCATCTT TAATTGATGT TTTTCTTTAA ACTTTTGTAA ATCTTGTCCT TTCGCCATCG 3660 

TGCCGTCATC ATTCATAATC TCACAAATGA CACCAGCGGG CTTGGCACCA GTAAGTTTAG 3720 

CTAAATCAAC AGCCGCTTCT GTGTGTCCAT TTCTAGCTAA TACGCCTTTA TCTTGTGCTA 3780 

CTAATGGAAA TAAATGACCA GGACGATTAA AATCTTTAGC TTCACTACTA GGATCAATGA 3840 

GCTTTTTGGC AGTCAATGTA CGTTCATAAG CACTAATTCC TGTTGTTGTA TCTACATGAT 3900 

CAATACTCAC TGTAAATTGC GTACCAAAGA TGTCGGAGTT ATCATCAACC ATTTGTACCA 3960 

AATCCAAACG TTGTGCAATA TCTTTAGACA CTGGTGCGCA TATTAATCCC C-TGCTTCTT 4020

TCGCCATAAA ATTAATGGTA TTATCGTTCA TCCATTCAGT AACCGCTACT AAATCACCTT 4080 

CATTTTCACG ATTCTCATCA TCTACTACAA TAATTGGTTC TCCATTTTTT AAAGCCATTA 4140 

AAGCACTGTC AATATTATCG AATTGCATGC TACCCCTCCt AAAAACCAAA TGCTCTTAAT 4200 

TTATCTACAG ATAATTGGTC TTTATCTTTA TTTAAAATAT TTTCAACATA TTTAAACAAA 4260 
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CTCGTTTCTG GAATAAGATG AATGTCAAAA CTGTTATCAT GCTTATCAAA TACCGTTAGA 4380 

CTAACACCAT CCACAGTAAT AGACCCTTGC TTAACTAACT GATTATTAAT ATGTTGGCTA 4440

CATTGAATCG TAATAATTTT TGCATTGGCT GTTTCATTTA TTTTTGAAAC TGTTCCTAGT 4500

TCATCTACAT GACCGAGGAC AAAATGTCCA CCAAACCTAC CGTTACCACT CATGGCACGC 4560

TCTAAATTTA CTTCTGATTG TCGCTTAACA TCTGCTAAAT AGGTTTTATT TTCAGTGCCT 4620

TTAATTACTT GAACAGTAAA AGATGTCTGA TTAAAATCAA TCACTGTTAA ACATGCACCA 4680

TTAACACTGA TGGAATCACC AATATGCATA TCTGCCGTAA TCTTATGTGC TTCAATTTCA 4740 

ATCGTCCTGA CTGATTGACG AATTTGAACA CTTTTAACGA CACCTATTTC TTCAACGATG 4800 

CCAGTAAACA TGCATCATCA CTTCTTTCGT AAAGTTAATT TAACATTTTG ATTTAATAAC 4860

TCGGAATGAA CAATTTCAAA TTGGTTCGCA TCTGGTATCT CAATCACATC ATTTGTTTGA 4920

TAAAATTGAT AATTTCCAGA TCCGCCAATT AATTTCGGGG CATAATAGAG AATAAATTCA 4980

TCTATATAAT TAGATTGGAG AAATTCTGAA GTAGTGGTTG GACCTGCCTC GACTAGCAAA 5040

GTTCCAACTC CTCTTTTATA TAAATTGTGA AGAATTGTTG TTAAATCGCA AGACTTCAAG 5100

TAAATAATTT CAATATGTGT TTGATTGGTT GTTAAATTTG GATTTTCAGT ATATATCCAA 5160

ATTGGTGTTG ATTCATCTTG ATAAATTTGC TGATTAAAAT GAATATTCCC AGACTTAGAC 5220 

AATATTACTT TTATAGGGTT TTTTCCATCT TGAATACGTG TAGTATATTG TGGATCATCT 5280 

AATTCAACTG TACGTCTTCC AGTTAACACT GCGTCGTGTC GATGTCTTAA CTTATAGACA 5340

TCTTGTTTAA CCTCTTTGTT AGTAATCCAT TGACTTTGTC CATTATCATT CGCTTGTTTA 5400 

CCATCTAAAC TTGCAGATAC TTTCAC7GTA ATTTGTGGCA GTTGCTTTGC TTTTGCTTTA 5460 

AAAAAGTCTT GGTATAATTG TGATGCCCGT TCATCATCAA CGCATTCAAC CTCAATACCG 5520 

TGAGCCCGTA ACGTCTCATC ACCATGTGTG TCTAACGAAT TGTCTTTTGT TGCGTATACT 5580 

ACTTTTGCTA TCTTACAATC AATTATTTTG TTAACACAGG GTGGTGTTGA ACCAAAATGA 5640 

CTACATGGCT CTAACGTAAT ATAAATCGTC GCACCTTCAG CATTTTGTTG TGCCATATCA 5700

AGTGCTTGAA CCTCCGCATG CTTG7CACCT TTTCTCAAGT GTGCACCAAT ACCAACAATC 5760 

CTACCTTCTT TAACTACAAC AGCGCCAACG GGTGGATTAA CACCTGTTTG ACCTTGTACC 5820

ATATTTGCAA GTTGAATCGC ATAATCCATA AATTGACTCA AATGATCACC TCTATAAACA 5880

AAAATCCTCA CATCATGAAT TAAGATGCAA GGAGaAAAAT TTATCGTTAA ATAAGCCTAT 5940

TTGTACACAT TTTTACAAAT ACGCTACATT ATCTTTGTCG ATAATTAACA TTCTTTCTCC 6000 

CATCCAGACT TTAACTGTCG GCTCTAGAAT CTCACTAGAT CAGCCACTAA TATGAAACAT 6060 
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TTaTATATGA AATTGTTATA GATTATTTGA GTACGTAGTA TGTCAACTAC ATTTAAAATG 6180

ATACTATATG TTTTCTGAAA AAACAATTAA TGACGGTTTT AATTTAATAT AATCTGAGTA 6240

CTATAGGCAT CTCATTGATA TGATTCTTAC TAACAGACAT TAAAATCAAA CCTTCAATTC 6300

GTCTCTATAG AGCGTTCTCT TTATTATCTT CTAGTTACAA ATTATTGATT GtCACtGCGC 6360

TGTTGTTGCT CATTCGATTC TAAAGCATCA TATAATTGAG ATACTGTATG CGCAACTTGT 6420

TCTACAATCA TTTTCACACC GTTTCGTAGT TTATTAACAC CGTTTGTCAT TTGACCTATC 6480

GCAATCATAT TTGTTAATGT TCCAAACCTT GGACTAATAA CTTGATTGGT TTCCGGAATG 6540

ATTTGTATGC CTCCCATTGG GTGTGCTTGT ACAATTTGTC TATTTTCAAG ATTTCTAATT 6600

AATTGATCAT CTTGATCCAA TTCATTTAAA TGACTTTTTG CACCTGTCGC GTTAATGACA 6660

ACATTATATA TGTCTACTGA TTCTTGGTTT TTGTATGAAA AATAATACAA CTTGCCATaC 6720

A7GTTCACAT CTTCTAAATC TTTTTTCAAA ATTAAAGACT TATTTTCTAT TAATTCAATA 6780

ATTAGTTCAG CAGTTCTTGG AGGCATTGGA TTTGAATTTA ATTGAATCAT CTTTGAGTAT 6840

TTTTGATTAA ATTGATGTTG GTCTTCAATA CTTAAGCTAT TCCATATCCA ATTTAAATTC 6900

TCTTTCAAAT GTTCAATCAT ACTTTGGAAA ATGCCCaTTT CTGTTGGACG CGCTAAATCA 6960

TACTTCAAAT CTGCAATATG ATTTCCTGTA CGTCTATGTA CTAATTTTTT AAAATCAATG 7020

TCATATTCAG CACATTCTTT TAAAAATAAA GAAACTAAAG TATCAAGCGG TGCATTGCCG 7080

AAATGATGTT TTTTAATGTC ATTTAATTTG TCTTTAGTTA AGTACTTGAA TGTCACGTCT 7140

ATCATTGTAC CTCTTACACT TGGTAAATGA GCAGAACGAC TCGTCATAGT AATTGGTAAT 7200

TTTGGATGAT GAGCAGCAAC ATAACGGACA ACATCTAAAC TGGCAAGGCC TGTACCAATA 7260

ATCGCAATAT CGTCCAGTTC ATTTACTTCG TCTAACGTAT TATATGTTGG ATAAGGCGTA 7320

gcGAJATATC CTTTTTTACC CTTTAAGTTA TATGGATCAT GGTAGGCAAA TGTACCACAT 7380

GTTAAAAATA CATAATCGTA CGCTTGCCAT GATTGTCCTG AATTTGTAGT ACATATGTAA 7440

TAAGTTAAAT TCGTTTCATC GATATTAGAA TTTGTATAAA TCTCTTGAAC TTTATTATAA 7500

TTAGTTGATA TATTTGGATA TTTTTTCGTG AACATAGATA AATAAGATTT CATATAATGT 7560

CCGAATACAA ATCTCGGTAA ATATGCAG 7588

(2) INFORMATION FOR SEQ ID NO: 143:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10320 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear


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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 143: 

nCTAGGTATT TTAAACCTAA TCTAGATAAA CTAGCTTCGT AAGCAGCTGC TACATTTTCA 60 

CGACCGAAAT CCTCAAAATA TAATTTTGAA GTAATAAATA AGTCTTCTCT AGCAATACCA 120 

GTTGACTCCA ATCCGGCACG AATGCCAGCA CCTACTTGTT CTTCATTCCC ATAAACTTTT 180 

GCGGTATCAA TACTACGATA TCCTTGTTCA ATGGCATACT TAACACTTTC CATGCAATTT 240 

TCATCATTTT CCACACGAAA TGTCCCTAAA CCAATTTGTG GCATCGTGTT TCCATTATAA 300 

AATGTTTTAA CCTCCATAAA TATCGCCTCA CCTTTTTGAT GTATTATACC CTGTTATCAT 360

AACAAATCTG AGTTGAATAC ATGAGAAAAA ACACTTAGAG CAATCAACCA CTAAAATTCT 420

AGTAATATCT CTCAAATATT AATCAAATTG TAAAAGTAAT TCTGTTTAAT TTATGACAAA 480

CTAAAAAAGC CGAAGTAACA ACATATAGTC ATCACTTCAG CCTAACATTT AATTGAATGA 540 

TTCAATTTTA TCCATCATTT GTTGTAAGTC TTCCACGTTG TATTGAATAC GACCATGGAA 600 

TACAAATTTG TTAAAGAACT CGTCTAATTG TTCAGCACCG ACAAGCACTT TGACAGCACT 660 

ATTTTGATTA TAATTTGAAA TCGTTACATC GCCTTCATTT TTAAGATTAA AGTATAAAAT 720 

TGAAGTTGGT GTATATTTGG CACCTAATTC TTTTTGTAAG TCTTCAGCCA ATTGTTTAAT 780

CGCCTCAATT TGATCTGAAT AATTTACAAA TGATAATGAA CGTTTGTCAT CATTTTGATC 840 

CATCACAATA GTTTGCGGTC TAGATTTATC TAAATCCAAT GTATCAAATA CTTGTTCCAT 900 

TGGTGGTAAA TCTTTAAAIT GACCGCCACT AATACCATTA TAAACATGAC CTTTTAACAA 960 

TTGAGAATCA ATAATATAAA GACCAGTTCT TGTTAATACT AAATGACTAA TTCGTTCAAT 1020 

ATTATTAAAG CCATCCTTTG GTAAAAAGAT ATTTGCCATA ATGTGCATAT CTTCTGGTCG 1080 

AATTCGTTTT TCTTTAACTA ATCTTTCACG AATACCAATT AATCTCATGT CCGTTACATA 1140 

TTCACTATGA TTTTTCGAGA ACAATTTTAA TGCGTCAATC TCACGATCTT TTGTACTAAC 1200 

CATGTGATTA TAATCTTCTT GTTGTTTTGT AATTGTCTTT TTATTTTGAA TACGCTCTTT 1260

CTCTAAAGCT TCTTCATGAG ACTTTTTAAT GTTTTGTTCT TGTTGTTCAT ACTTTTCTTC 1320

TGTTTGTCGC TTAACTTTTT TCTTACTACC TAAGGCAACT AAAAAAAGGA CAAAAAAGAT 1380

TAATGCAATG AgCTACTGCA A7AATGAGTC CAATGACTAT CGGTGAAGAT AAATCCATCA 1440

CAACAACGCT CCTTTTTAAT ATATGAATAA CTTTAATTAT AATAGAaAAG CTAAAGATTT 1500

TCGATACATA TTATCATTTA TATACCGAAA ATCTTTTATT TAGCTATATT CAATTCATCT 1560

TATTATTTTA CTGCGTCTTT TAATTCTTCC ACTTTGTCTA ATTTTTCCCA TGGGAATAAG 1620

ACATCTGTAC GTCCAAAATG ACCATAAGCA GCAGTTTGTT TGTAAATCGG TTGTTTCAAA 1680
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AGTTGCCCTT CAGAAACTTT ACCTGTTCCA AATGTATCAA TTGCAATTGA CACTGGTTCT 1800 

GCAACACCAA TCGCATATGC CAATTGTACT TCACATTGAT CTGCTAAACC TGCTGCAACA 1860

ATATTTTTAG CCACATAACG TGCAGCGTAT GGAGCTGAAC GGTCTACTTT TGTAGGATCC 1920

TTACCACTGA AGCATCCGCC ACCATGACGT GCATAGCCAC CGTACGTATC AACAATGATT 1980

TTACGTCCTG TTAATCCTGC ATCACCTTGA GGTCCACCGA TTACAAAGCG TCCTGTAGGA 2040

TTGATGTAGA ATTTAGTTTG TTCATTAATC AAGTTTTCTG GAACAG7TGG ATAAATGACA 2100

TGTGCTTTAA TGTCTTCTTG AATTTGTTCA AGTGTCACAT CCTCAGCATG TTGTGTTGAT 2160 

ACGACAATCG TATCAATACG TACTGGGTTA TCATTTTCAT CATATTCAAC AGTGACCTGA 2220 

ACTTTACCGT CTGGTCGTAA ATAATTTAAC GTACCATCTT TACGCACATC TGATAAACGT 2280 

TTTGCCAATT GATGTGATAA ATAAATTGCT AGAGGCATAT ACGTCTCTGT TTCATTCGTT 2340

GCGTAACCAA ACATTAAACC TTGGTCACCT GCACCTGTTG CTTCAATTTC TTCTTCGCTA 2400

TCTTTATCAC GATACTCTAA TGCTTTATCC ACGCCTTGTG CAATGTCAGG TGATTGTTCA 2460

TCAATCGCAG TTAAAATTGC CATTGTTTCA TAATCATAAC CATATTTTGC TCTTGTGTAT 2520

CCAATTTCTT TAATTGTTTC TCTAACAACT TTCGGAATAT CAACATATGT TGTTGTAGAA 2580

ATTTCGCCGG CGATCAATGC CATACCTGTT GTAACAGTTG TTtCACAAGC TACACGTGCA 2640 

TTTGGATCGT CTTTTAAAAT AGCATCTAAT ATTGCATCTG ACACTTGGTC AGCGATTTTA 2700 

TCTGGGTGTC CTTCTGTAAC AGACTCTGAA GTAAATAATC GTTTGTTATT TAACATAGTT 2760 

TGCTCCTTTA AATTTATATT ACGAAAATTC TCTCTCTGTG AGCTAAATAA AAAAGACCTT 2820 

CTAACTATTA ATATAGAGAG AAGGCCTAAT ACGTCCATTC GCTCTTATCG TTCAGACCTA 2880

TTTGTCTGCA AAcGGTTTGG CACCTTTCTT TTATAAAAAA GAGGTTGCTG GGTTTCATTG 2940 

GGTCCATGTC CCTCCACCAC TCAGGATAAG AGAATCCGTT AAAAATAATA GTACCTAATT 3000 

AATGAATTAA TGTCAATTTT TCACAAATAA ATTTACAGTA AAATATTGTA GATTAATTAT 3060 

GTTAATGTGT TATACTAATT AAATGTAAAG GCTTACATTT AAATTATCGC TTTGGAGGGA 3120 

TTTAGGATGT CAGTAGACAC ATACACTGAA ACAACTAAAA TTGACAAATT ACTGAAAAAA 3180 

CCAACGTCAC ATTTTCAACT TTCGACGACA CAACTTTATA ATAAAATCTT AGACAATAAC 3240

GAAGGGGTAT TAACAGAACT TGGTGCTGTT AATGCAAGTA CTGGAAAATA TACTGGTCGT 3300

TCGCCTAAAG ACAAATTTTT TGTCTCTGAA CCTTCATATA GAGATAACAT TGATTGGGGA 3360

GAAATTAATC AACCTATCGA TGAAGAAACT TTCTTGAAGT TATACCATAA AGTACTAGAC 3420 

TATTTAGATA AAAAAGATGA ACTATACGTA TTTAAAgGcT ACGCTGGTAG CGATAAAGAT 3480 
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ATGTTTATTA GACCTGAATC AAAAGAAGAA GCTACAAAGA TTAAACCTAA CTTCACTATC 3600

GTTTCTGCAC CACATTTTAA AGCAGATCCA GAAGTTGATG GTACTAAATC TGAAACCTTT 3660 

GTCATTATTT CATTTAAACA CAAAGTCATT TTAATCGGCG GTACTGAATA CGCTGGTGAA 3720 

ATGAAAAAAG GTATCTTCTC TGTAATGAAT TATCTCTTAC CGATGCAAGA TATTATGAGC 3780 

ATGCATTGCT CAGCAAACGT TGGTGAAAAA GGCGATGTTG CATTATTCTT TGGTCTATCT 3840

GGCACTGGTA AAACAACCTT ATCGGCTGAC CCACACCGTA AACTAATCGG TGATGATGAA 3900

CACGGCTGGA ATAAAAACGG GGTCTTTAAT ATCGAAGGTG GCTGCTATGC AAAAGCAATT 3960

AATCTTTCCA AAGAAAAAGA ACCACAGATT TTTGACGCAA TCAAATATGG TGCAATTTTA 4020

GAGAACACTG TAGTTGCAGA AGATGGTTCA GTGGACTTTG AAGACAATCG TTATACAGAA 4080

AACACGCGTG CCGCTTATCC AATTAATCAC ATTGACAATA TTGTAGTACC ATCTAAAGCA 4140

GCACATCCAA ATACAATTAT TTTCTTAACT GCGGATGCAT TTGGTGTTAT TCCACCGATT 4200 

TCAAAGTTAA ATAAAGACCA AGCAATGTAT CATTTCTTGA GTGGTTTCAC TTCTAAATTA 4260 

GCTGGTACAa GCGTGGTGTG ACAGAACCTG AACCATCATT CTCAACATGT TTCGGAGCAC 4320 

CGTTCTTCCC GTTACACCCT ACTGTTTACG CTGATCTATT AGGTGAACTT ATCGATTTAC 4380

ATGATGTTGA TGTTTATCTT GTTAATACTG GATGGACTGG CGGAAAATAT GGTGTAGGAC 4440 

GTAGAATCAG CTTACATTAC ACACGTCAAA TGGTAAACCA AGCGATTTCT GGCAAATTGA 4500 

AAAATGCAGA ATATACAAAA GATAGTACGT TTGGTTTAAG CATTCCTGTA GAAATTGAAG 4560 

ATGTACCGAA AACAATTTTA AATCCAATTA ATGCTTGGAG CGACAAAGAG AAATATAAAG 4620

CACAAGCAGA AGATTTAATT CAACGTTTTG AAAAGAACTT CGAAAAATTT GGTGAAAAAG 4680 

TTGAACATAT TGCTGAAAAA GGTAGCTTCA ACAAATAAAT TTGAATACTA AATCaAAACC 4740 

ACCGGTGTGA ACGGGTGGTT TGTTCTGCGG CTATAAGCCT TCCTTACTGG CCAGCCCTAA 4800 

AAGGGCACTG ACAAGTCAGC CAACTGCACT ACTATTCCAG CAACCCTAAA GGGTTACTCT 4860

TTTTTCTTTC TTTTTTTATT TTTCTCTCCA GTGAAAGGAT CTAAATATTC TTCCATTGAG 4920

ATTTGGTCTG CAACGATATC CTCTTGTAAT TGATTACGAA TATAATTTTC AATCACTTTT 4980

TTATTTCTAC CTACTGTATC CACATAAAAT CCTT7ACACC AAAACTTTCT ATTTCCATAT 5040

CTATACTTTA AGTTAGCATG TCTATCAAAT ATCATTAAAC TACTTTTTCC TTTTAAATAG 5100

CCAACAAATG ATGATACCCC AAGTTTGGGT GGTATACTAA CTAACATATG GATATGATCT 5160 

TTACATGCCT CTGCTTCAAT TATCTCTACA CCTTTTCTTT CACATAATTG ACGCAATATA 5220 

ATCCCTATAT CTTTTTTTAT TTTTCCATAT ATCACTTGTC TTCTGTATTT AGGTGCAAAG 5280 
AAATAGCATC TCCTCGTGTT GATTATTTTG GTTGGCTGAC CAATATTTAT TCTAGCACGT 5400

AGAGATGCAT TTTTTGTGAC AATGGTAGAA CCTTTTCtGa ACCATACGCA TAGCGTATGG 5460

TTTTCTTTTT ACAATTAAAG AGCCAACCGT TGTTATAGTC TAACAATGGT TGGCTCCTCT 5520

TATTTTATGT GCTAAAAATT TATAGGCAAT TTTATTACAA CAATGTACAT TTAAGGTGAC 5580 

CTTCATGCCA AAATCGCATC ACTCATTTAA TGGAAGCAGC ACGTCTTCAT ATAAAGTACC 5640

GATCCCTAAT TCAACGCATG TAGTACCACA TCTTCAAAGC TTGATAGTTC CCATGCGCAC 5700 

ACCACGTTTC ATACTAGCTA TGCGACTCAA CTTGGTTCAT AAACTCTTTA ATATAAGTCA 5760 

ATGTTTCAAC CATCGCTGGT GGTCTTGGCA CATGTCCTTC TGCCATTTGA TAAAATGTTT 5820 

CATGCGTGGC ACCTTTTAAC TCTAGTTGGT CCGCTAAATA ATACGCATGA TGAATACCAA 5880 

CTTGCTGGTC TTTCCCTCCA TGTACAATTA ATATTGGCGG ACTGTTTTCA TTAATGTTTG 5940

GAATCGCTTG GCGTGCCTCA TATGCCGCTC GATCTTTTTT CGGATGACCA ATCATTCTTC 6000

GTAGCATGCC TCTTAAATCG ACACGTTCTT CATACATTAA ATCAATATCT GAGACACCAC 6060

CCCAGATTGT ATAACTTGTT ACTGGTAAGT CTTGAAATGT CAACAATCCT TGTAAACCAC 6120 

CTCGCGAAAA ACCAACCATG TGGATAAATG CATGTGGATA TTTATCATGT AGCAACCTTA 6180

ATAATTGCGT CACATCATTT AAATCGCCAC GGTAAAATTC GTCTTTGCCT TCACTCCCAT 6240 

TGTTACCTCG GTAGTATGGC CCAATCACTA AAGTTTGACT ATCTGAAAAT TGCATTAATC 6300 

TACCTGCGCG CACACGTCCT ACTTGACCTT TGCCACCTCG CAAATAAACT ACAATGCGA7 6360

TTACTTCATG ATGTGGTGTC ATCATTAAAG CTTTTACTTG TAAGTCATCT GACAAATATG 6420

TAATTTCTTC GAATTGATGC GTAAAATATT CAATTGGCAT TCGTTTACGT TTGATAAAAC 6480

CCAAGTGATT GCACCCTCTC TACGCATTTT AAAATGGTAC TATCTTGCAG TAAGAAACTC 6540

CGTTGTGCGA GTTCAATATC ATTGATACAG TTAAACAACA CTGGCCCTGC TGTTTCTAAA 6600 

TAATCGTTCT TGCTTACCAA TGATTCAACT TCGATAAAAT ATACATCTTT TACAAAATCA 6660

GTTTGATCAT GTGTTTCAAT GGTATATTGT GCTATGTAAT AAATATTTTT AACTTTGGCG 6720 

CCTGTTTCTT CATATAATTC aCGTGTAACT GCTTCAGCAC TACTTTCCCC GCGTTCCCTT 6780 

TTACCACCAG GAAATTCAAT CCCCCGTAAA TTATGTTTGG TAAAAAGCAA TTGATTTTTA 6840

AACGTTGGAA TAGCTAGCAC ATGATTGCCA TCTGCTATCT CATTATCCTT TTTAAATGTC 6900

AAATTAACTT GACGATTATC TTTATCCCTA AACTTCACGC GCATCACATC CCTACATTGT 6960

ATGTTAATAT AATAGTTAAT TACTATCGTT GGAGGCATTA ATTATGAAAA AGATATTCTT 7020 

GGCGATGATT CATTTTTATC AACGTTTCAT TTCGCCACTC ACTCCACCAA CTTGTCGTTT 7080 
CCTTTATTTA GGTATCCGTC GTATTTTAAA ATGTCATCCG CTTCATAAAG GCGGCTTTGA 7200

CCCTGTTCCG TTAAAAAAAG ACAAGTCAGC AAGCAAGCAT TCACATAAAC ATAACCATTA 7260 

ATATGGTTGT AATTGAGTTA TATCCACTAA PGGGGQGQGK AATTCGAGTC GCCCCTCTTT 7320

TAATATGCCT GAATGCGCCA CCACATCTTG TTCAAAATAA TAACCTGCTG GTGTAACATC 7380

TCCTGGATAA TCACCTTTAC GAGCAAGCAT CGCTGTAAAA TAGCGGCTTA AACCATATTC 7440

GTACATGCCG CCAATAACCA CTTTTGCACC ATGACTTTTC AAAGTATCAA TTGCCGTTTG 7500 

CACTTTATCA ATGCCACCTA GACGAAATGG TTTTAATACA ACAACTTTCA CATTGTATAA 7560 

TTCTATCAAA TTAATTATGT CCaACAACGA TGTTGCCTTT TCATCAAGGG CTATTGGAGG 7620 

TATTGTTCCA TCCGCTACTT CATCAAGCAT GGAGATATCT TTAAATGGCT CTTCGATATA 7680 

AAGAACCTGT TCACGCGCTA ATAACTGTAA CTGTGTGAAA TCTTGACGAT CCAAGGACTC 7740 

ATTTGCATCT ATAACCAATT GAAAGTGAAA GTCTAATTCC CGTAACACTC 7AATTTGATG 7800 

CATGATTTGA GGCGTCCATT TTAATTTAAT TCTGGTCGGC TTTGTTGCTT TTAATGACTC 7860 

TAGTTGTTTA TTTGATAAGC CGCTCGcTGT CGCTCCATAT GCTACTGAAA ATGAAGGCAG 7920 

TACATGAAAC ATTTGATACA ATGCCATGAC AATAGTTGCC CTTGCAGCAG GCGTATTTTC 7980

CAATGAATCT ACTAATTTTA GTGCTGCTTC ATACGTTTCA AATGATTTAT TTCTATTATC 8040 

TTCGAACCAT TGCTCAATTA CATGTTTCAC TGAGGCAATT GTTTCATGAT CATACCAATC 8100 

TGTTTGAAAA GCGTTACATT CCCCGAAATA TGCATTTCCT TTGTCATCAA TCAATTCGAT 8160 

AAACAAACAA TCACGATGCG TTAAAGTGAC TTTCGGTGTT ACAATTTGTG ACTTAAATGG 8220 

CTCACTATAT TTATAAAAAT GCAAAGCTGT CAACTTCATC AAATCATCCT CTATACAACT 8280 

TATTTCTTTG TAATTTACCT GTTGATGTAT AAGGTAAAGT ATCAACCTTT TCAAAGTGTT 8340 

TCGGTACTTT ATATTTCGCT AAATGTTGTG ATAAATATGC AATCAATTGT GCCTTTGAAA 8400

TGTCACTTTC ACTGACAAAA TATAATTTAG GCACTTGGCC CCAAGTATCA TCAGGATGCC 8460

CTACACATAC TGCGTCACTG ATACCTGGAA ATTGctTCGC TACCGTTTCA ATTTGATATG 8520 

GATAAATATT TTCACCGCCA CTAATAATTA AATCTTTACG TCGGTCATAA ATCATGACAT 8580 

AACCTTCATG ATCTATTTCA GCAATGTCAC CCGTATTAAA ATAACCATTT TCAAACGTAC 8640

CCGTTAAATC TGTTGGATAC AAATATACAT TCATCACATT GGCGCCTTTA ATCATTAATT 8700 

CTCCATGACC TTCTTTATTA GGATTTTTAA TTTTTACGTC AACATTGGCA CTTGGCATCC 8760 

CTACAGTGTC AGGACGTGCA TGCAACATTT CCGGTGTTGC TGTTAAAAAT TGCGAACATG 8820 

TCTCAGTCAT ACCAAATGAA TTATAAATTG GCAGGTTATA TTGTAATGCC GTCTCTATCA 8880 
AACCTTGTTG CATAAGCCAA TTTAAAGTTT GTGGCACAAG CGAAATGTGC GTGATTCGTT 9000
CATTTTTAAT CATCGTTAAA ATTTGTTCGG CATTGAATTT ATCAACAATG CGCACAGTAA 9060
AACCTTCAAT AACAGCTCTT AAAAGTACAC TGAGACCCGA AATATGATAA ATCGGCAAGA 9120
CAGATAGCCA ATTAGTGTCA CGATCAAATC CCAAGCTCTC TTTACATCCG ATTGCACTGG 9180
CATAATGATT ACGAAACGTT TGTGGCACCG CTTTTTGAGG GCCCGTTGTC CCTGATGTAA 9240
ACATAATCGA TGCAATGTCA TCTAAATTAA ATGATGTATT TAATATGTTG GACGGCGACT 9300
CTTTCGGCAC CACAGTTTCA TTCGATGTTT CATATTGGAT ACCCATTGTG TTGTCCAACA 9360
AACTGTTCGT TGTAATATCC CTTCCAGCGA ATTCAATATC ATCCAGCGAT ACAATTTGAA 9420
ACCCTCGTAA TTCCAGTGGC AAGGTACAAA AAATCAATTG TACATCGATT GACTTCATCT 9480
GATTCGTCAT CTCATTAGGT GTCAACCTTG TATTAATCAT CGCAATTTCA ATATTTGCCA 9540
ACCAACATGC ATGTATTAAA ATGATCGATT GAATCGAATT ATCTATGTAT AGCCCAACAC 9600
GAGATTGTTG ATAAGCCTTG AGTCTTTTAG CCAATAGACT CGCTTCACAG TATAAATTTT 9660
GATAAGTATA AGATTCTTGA CCGTCTGTTA TCGCAATATG ATGTCCATTT TGTTGTGCTT 9720
GTTTATATAA CCAAAAGTCC ATGCGTTATT CCTCCAAAAT CATTTACATT ATAATTATAA 9780
CGATTTTATG ACATTCTAGC AGTGGTTATG TTTAAAAATA TAAAAAAGTA GACGAATTGA 9840
TGCATTGATA TGATTGTTAT AATGCTCAAT ACATATCGTT ATATCATTCG TCTACTATTA 9900
TCAGTTATTT TTATTTAATT TTAGTGTCAT TCTGTCATTT TGATGTGGTG ATTTACCCAT 9960
TGTTGCCACA TCATCTGCAA TGTCAATTGG TATACGGTTC ATGTCTTGTA ATGCACTTAA 10020
ATGGAATACT TCATCATCTA AATTTTCAAT GAGATATACA TAATATGTTA CCTTGTCCTT 10080
TTTATATTTT AACGTTTTCC AAAAGTCCGG CTTGCAATTC AATACATTAT CCGGAATATA 10140
TTCAATAAAT AAGTAACGTT TGCTGCCTAC TTTGTCTATG AAATATTTTG CAGTGCCTTT 10200
TTCTATACCT CTTATATGTG CATAGTCTGC TGAAAAGTAA ATACTACCTA TTGTTTCATT 10260
ATGTTGTTGT ATTTCAAATC GTTGGCCTAC TATTTTATTA TTTGTGCTAC nGGGGACTTA 10320

(2) INFORMATION FOR SEQ ID NO: 144:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1477 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 144:

kAATACTGTC ATCAATATGA TAAGTTACAA 120

kTACATCTCC AAGCAATATC ATTTGCGmTA 180

•AATTCCTGG CGTCTTAATC GTTGTAGATG 240

tACTGTCACC ATATGCTAAC GGCGCTGCAG 300

ITGTAATATG CACAACAAAG TCTCCAGTCC 360

'ACCTCCTGC ACGTTGAACT GCAATAGCAA 420

AAAGCTAAA TGGTGCTGCA TTTACTGATG 480

AGGTCGTGG AATAATTGAA CCAATTAATA 540 

TGCATCAAA CGTATACATA ATACCTACCT 600 

.TTTTGGTTA ATTTAAACAT CTATTCTCCT 660 

TTTTGAAAT ATGACACATA TGCATATCTT 720 

TTCAGCACT TTTAATGTAG TTAGACAGCG 780 

TTTACGTCG TTCAATGAAC TGACGCGCTT 840 

ACCCGTTCG ATACTTTTGT CCAATATCAT 900 

GATTTCAAA TAAATAATTC ATAATGTCTG 960 

TTTGACACA TTCAGCATAA CCATCATACG 1020

TCTTCCTGC TTCTGTATGT ATAATTCCAG 1080

TAAACATCC TCCTGCTACA TAAACAACTG 1140

TTTTTAACA AGGTTATACC ATTTAATACC 1200

CGATACCCA TATTTTTCAT AAAATGAAAT 1260

TGTCGATGA TGATTCTTAG CAAGAGTTTC 1320

TGACCTTGA TAATTTGGTG CTACAACAAG 1380

TTATTTGTG GAGACATTTT TAAATAAATC 1440

CCGTTG 1477
(2) INFORMATION FOR SEQ ID NO: 145:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3976 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
GTGTGGATTG


GATTTTAAAA 


TCACCCTCAT 


TTTCACCTAT 


TATTAAATCA 


GCCCCATCTA 


GTTTACATTC 


GAATCTCATT 


TTCGCATCTT 


TTAAAAGTGA 


TAATTCTGTA 


CGACTCAACT 


TCTCATTAAT 


ATCTTGAACA 


TTATCTTCGT 


GTTCTATATT 


TAATGCAGTA 


TCTTTTCTCT 


TCATTGGCGG 


ATGATTATTA 


ACAATATTAA 


CATCTTGATT 


TAATGTTGTA 


ACAAAAGCTA 


ATTTATAGTT 


TTCTCTAGCA 


GTTAATGATT 


CTTTTCTAAG 


TATATCTAGG 


TATTTCTCCG 


CTGAAAATCA 


CTTGTATTTA 


TTTAGCAAAT 


CTGGATATTT 


TTCTAAATGT 


TGCTGATGTT 


GTAAGACTTC 


CACTGCAATT 


TGATCTCTGT 


CAATTAAGTG 


GTCATCTACA 


CAACTATATA 


TTCCTTGTTG 


ATTCACACTG 


TAAGGATCAA 


TAATTGTTAA 


CATACGATCA 


TCGAAATGAA 


GACCGTCTAA


TTTAGAGCTT 


CTTCCATTTG 


GTATTGTTGC 


AAAAAATGCT 


TCAACACCCC 


CCATATTTAC 


ACCTCATCAT 


CCTTTTTTAT 


GCCATGACAT 


GATTCTGATA 


CACCT^CATT 


TAATGATTCT 


CGACATGTTA 


ACGTTACACC 


AAAATAGTTT 


AGTAAGCGAC 


CTGCAATACC 


ACCTAACACA 


CTAATATAGC 


CACCTTCACT 


ATCGCTAATG 


TAACGCTCTT 


TTATGACTGG 
AGGTGATTAT CCTAAAAATG CTCATGAGGT CGCTATTAAT GATAAGTTAG CTGCAGACAA 60

CATTAGAGTC GGGGATAGAT TACATTTTAA AAATAATTCA ACTAGTTATA GAGTTTCTGG 120 

TATTTTAAAC GACACAATGT ATGCGCATAG TTCCATTGTG CTATTGAACG ATAACGGATT 180 

TAATGCATTG AATAAGGTTA ATACGGCATT TTATCGAGTG AAAAATTTAA CACAACAACA 240

ACGTGATGAG CTTAATAAAA TAAATGACGT TCAAGTTGTG AGTGAAAAAG ATTTAACAGG 300 

TAATATTGCG AGTTATCAAG CAGAGCAAGC ACCGTTAAAT ATGATGATTG TTAGTTTGTT 360 

TGCTATTACA GCAATCGTTC TAAGTGCATT TTTCTATGTT ATGACGATTC AAAAAATATC 420 

ACAAATTGGC ATTTTGAAAG CAATTGGTAT TAAGACAAGA CATTTATTGA GTGCGTTAGT 480 

TTTACAAATT TTAAGACTAA CAATAATTGG GGTAGGTATT GCTGTGATCA TCATAGTAGG 540

ACTATCATTT ATGATGCCGG TAACGATGCC TTTTTACTTA ACAACGCAAA ATATTTTATT 600 

AATGGTGGGG ATATTTATAT TAGTAGCGAT TTTAGGTGCC TCACTATCAT TTATCAAATT 660

ATTTAAAGTG GATCCTATCG AAGCAATTGG AGGTGCAGAA TAATGGCATT AGTCGTTGAA 720 

GATATCGTCA AAAATTTCGG AGAAGGTTTG TCTGAAACAA AAGTTTTAAA AGGTATTAAT 780 

TTTGAAGTGG AACAAGGGGA ATTTGTCATT TTAAATGGTG CCTCTGGTTC TGGGAAAACA 840

ACATTGCTAA CGATATTAGG CGGATTGTTA AGTCAAACGA GTGGTACAGT GCTTTACAAT 900

GATGCGCCAT TGTTTGATAA ACAGCATCGT CCTAGTGATT TACGATTGGA AGATATTGGT 960

TTTATTTTTC AATCTTCACA TTTAGTTCCT TATTTAAAAG TGATAGAGCA ATTGACACTC 1020

GTAGGTCAAG AAGCGGGAAT GACCAAACAA CAAAGTTCAA CAAGAGCAAT ACAACTTTTG 1080

AAAAATATTG GTTTAGAAGA TCGCTTGAAT GTATATCCGC ATCAGTTATC TGGCGGTGAA 1140

AAGCAACGTG TTGCGATTAT GAGAGCATTT ATGAATAATC CGAAAATCAT TTTAGCAGAT 1200 

GAGGCCACAG CAAGTTTAGA TGCCGATAGA GCAACAAAAG TTGTTGAGAT GATACGTCAA 1260 

CAAATTAAAG AACAACAAAT GATTGGTATT ATGATTACAC ACGATCGAAG ATTATTTGAA 1320

TATGCAGATC GAGTGATTGA ATTAGAAGAT GGCAAAATAA CTGATTAGTG GCTTGTAAAG 1380 

ACGCTAAATG TTAATGATTT AAGACATAGT AGTATAAAAG TTAGATAACA GAATACGATT 1440

TGGGTTTACA AAAAACAGGC TGGGACATTA AGTTCTTAGG CAATGTAAAA AAGCTGATTT 1500

CTATTAATTA TTTGATAGAA ATCAGCTTTT TTGATATGTA TTTTATAATG TACAGCTCGT 1560 

TGCATTCATA TAGCTTGAAG TCACGTTTAA AACCATATCT ATCATTATGG TATGCATATC 1620

TTTTAAAACC TATTCTTTTG TTATTAGGAC ATATAAATTC ATCATTAAGT TCGTCATATT 1680 

TCCAATTTTG AGTGTTAAAA ATGTCACTTT TAAACTTTCT AGTTTTATCT TTAATAAACA 1740




CACTATCATA ACATGCATCA GCTACAATAT ACTCCGGTAA ATAACCGAAG nTATTTTgAA 1860

TCATTGTTAA AAATGGAATT AAAGTTCTAG TATCTGTTGG GTTTTGAAAT AGGTCATAGG 1920 

ATAAAACAAA TTGAGAATTT GTCGCTATTT GTAAATTGTA TCCTGGCTTA AGTTGGCCAA 1980 

AGTGTCTTAT TTTTTTAAAG TATTTAAAAG TAAAATTACA TGTTAATACG TAGTATTAAT 2040 

GGCGAGACTC CTGAGGGAGC AGTGCCAGTC GAAGaCAGGG GCCCCAACAC AGAArcTGAC 2100 

ATATAGTCAG CTTACAACAA TGTGCCGGTT GGGGTGGCTG AGACGGCACC CTAGGAAGGG 2160 

ACCCGTCATC AAAAATTCTA TTTATAGAAT TTTACAGTAA TGTGCCAGAT GGGCATAGCG 2220 

AAgcCATTCA ATACGAAGTA TTGTATAAAT AGAGAACAGC AGTAAGATAT TTTCTAATTG 2280

AAAATTATTT TACTGCTGTT TTTTTTAGGG ATTAATGTCC CAGACTCTTT AGTTTATTTA 2340

TTTTCAATAT AACAATTGTC TAATCAAGGA TTAACGAATA TTTAAAGATA GTTTGACGCA 2400 

ATATTAGAAA CAACCTATAA TAATAGTTTG TTTGTGGATT AACTATTATA AATAAAAGCG 2460

GCGTAAAGAC ATATAAACCA ACTACTTGAA CAATATAACG TTAATAACAA TCTATACTGA 2520 

TACATTACGC CTAGATAATC TTTGATGAGC ACATGTAAGA AAAAGTGATA TGGTGTATGA 2580 

CTTCCGACAC CATCGATAGA TAAACCTAAT TTTTGGGCTA GTCGTAAGGC GCGCAATACA 2640

TGAAACTGAC TTGTtACACA AACAATTTTA ACTGCTTCAT GATACAAATT GTTGATGATT 2700 

TGTTTAGAAT ATAAAAAGTT TGTGTATGTA TTTATAGAGT GAGATTCCAT TAGTATATCT 2760

GTTTTATCAA CACCATGTGC AATCAAATAA CGTTGCATAG CTAAAGCTTC AGAAATTGGT 2820

TCGTCTGGTC CTTGTCCGCC AGATACAATG ATCTTTGTTG CTGATGCTTG TTGTTGATAG 2880 

ATATCAAGTG CACGATCTAA ACGCGCTGCA AGCATTGGTG TGACAAATTC GGTAAAAATA 2940

CCAGCACCTA ACACAATTAT GATATCAACT TCTTTGTTGT ATGATCTATG TCTATATGAT 3000

ACTOTCCAAA CGAGATAACA AATAAAGGTT AGTAACAGGG AAAGACATAA TATAGCTAAC 3060 

CACATAGACA AACCTTTCAC AATAGGTGAC TGAATCGTAC TTATAAATAG AAGTGCTGAT 3120

GTGTAGAGTA CAAATTTATA TGAAAAAGAT AATAATTTTT TAATAAATAA GCGACTAGAA 3180 

GTATGAGAAA ATAAATATCT ATGTTTGAAT AGCATGATAA TACTGATTAT TATAAATGTT 3240

ACAAACATAG ACCAAGGGAA AGTATAGGTC ATGATGCTAT AGATGAGTGA CAAAAATATC 3300

GATATGACAA CTAAGATGTA GCATGTTAAA TTTAACGTCA GAGTATAGTT GAAAATTAAC 3360 

GGACAAATAA CGATAAGTAT AAATATTAAT AATAAATTCA ATAACATACT GACACCTCGC 3420 

TTATAATAAA TATTAAATAT AAATGTAGAT GATTTAATTT ATTAAAGCAA GGAGAAAGCA 3480

GCAACATGTA AATCTTAATT TGTTATATTA TATATGGGTC AATATTTTTG TGTTTTTTAG 3540 
TATGGTAAAA CATTTACAAG ACCATATTCA ATTTTTAGAG CAGTTTATAA ATAACGTTAA 3660

CGCATTAACT GCAAAAATGT TGAAAGATTT ACAAAATGAA TATGAAATTT CATTAGAGCA 3720 

GTCTAACGTA TTAGGTATGT TAAATAAAGA ACCTTTGACA ATTAGTGAAA TCACGCAAAG 3780 

ACAAGGTGTA AATAAGGCCG CAGTAAGCCG ACGAATTAAA AAGTTAATCG ATGCTTAATT 3840 

AGTTAAGTTA GATAAACCAA ATTTAAATAT TGATCAACGT TTGAAATTCA TAACCTTAAC 3900 

TGACAAAGGT AgAGCATATT TGAAAGAACG TAATGCGATT ATGACAGATA TTGCGCAAGA 3960 

TATTACTAAT GATTTA 3976 
(2) INFORMATION FOR SEQ ID NO: 146: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3346 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 146:

GCTACCTAGG CATTTAAGAG ATCAAAAAAT GTATGAATAT GAACGTTATT TTTATGAGCA 60

AGAACTTAAT GGCGTTGATG aAGGGGAAAT TTTAAAGAAG TTAAAAGACC CACAAGATGT 120 

TGCAGCTGAA ACAAAAGCTA GAAGTGTTAT TGATTATGCT GAATCTAAAC CAACATTTGA 180 

AAATATTTCA AGAGCTGTTG CTGCTTCATT AAGTTTAGGC ATTCTATCTA TTTTTGTCAT 240

CCTTATACCA GTATCTATAG TTGGATTATT TGTATTAGCA TTATTTTTAA TATCACTTTT 300 

GCTGCTGTTT TGTCCAATTA TTTTATTAGC ATCAGCAATA TCCAGAGGAA TTGTGGACTC 360 

AATTAGTAAT GTATTTTTTG CCATATCATA TTCAGGATTA GGATTAGTAT TTATCATTGT 420 

CATASTTAAG ATTTTAGAAT ACATTTATCG TTTAATCTTA AAATATTTAC TTTGGTATAT 480

TAAAACTGTC AAAGGAAGCG TTAGAAAATG AAGAAATTCT TTTTTATTGG GCTTTTAGTG 540

TTTGTTGTCT TTTTTACAGC AGCAACCATT ATTTGGTTCA GCTATGATAA AAACAAATAT 600 

GGTACTAAAC AATATGATAA AACATTCAAA gACGATGCTT TTGACAATGT ATCTATAAAT 660 

TTGGATAGTA CAGAACTTCG TATAAAACGG GGGAATCAAT TTAGAGTTAA ATATGATGGT 720

GACAATGATA TATTAATTAA TATAGTAGAT AAGACGTTGA AGATTAGTGA TAAAAGGTCT 780 

AAGACAAGAG GATATGCAAT TGATATGAAT CCTTTTCATG AGAATAAGAA AACGTTAACG 840 

ATTGAAATGC CTGATAAAAT GATTAAACGT TTAAATCTAT CATCTGGAGC AGGAAGTGTT 900 

AGAATCAGTG ATGTTGATTT AGAGAACACA AGTATTCAAA GCATTAACGG TGAAGTAGTT 960 
AGTAAAAGTA ACATTAAAAA TAGCAATATT AAAGTTGTTA TTGGTACGCT ACAAATCGAC 1080

AAGAGTCAAA TTAAACAATC CATATTTTTA AACGATCATG GTGACATTGA ATTTAAAAAC 1140

ATGCCATCAA AAGTAGATGC AAAAGCTTCT ACTAAACAAG GAGATATTCG TTTTAAGTAT 1200

GATAGTAAAC CTGAAGACAC TATACTAAAG CTAAATCCGG GAACGGGTGA TAGCGTAGTT 1260 

AAAAATAAAA CATTTACTAA TGGtAAAGTT GGGAAAAGCG ACAATGTTTT AGAATTTTAT 1320 

ACGATTGATG GTAATATCAA AGTTGAATAA ATAAAGGATG TAAGCACCGA TATTAGGAAG 1380

CATAATTTCT CTAATATCGG TGTTATTTAT TTGTTGGCAA AAGTTAAGTC GGTATCTATA 1440 

TTGCCAGTAA AGTGAGTGAT ATTAAGGTCT TGACCATCTA ACCATGATTT GAAATCTATT 1500 

ATTTCTGGTG GCGCATTTTC TCCCAATGTA AAATATGCAG TTAATGTTTC AGGTTGATAC 1560 

ATTGATGTAT GGATGGTGCC AGACCAGCTT TTGAATAGTT TACTGTAAAT TTCATACTGA 1620 

GGATTATTGA ATAACTTAAA TGCTGTAGTC ATATCTAAAT TATCATTAGT TTGTGAAATG 1680

GTACGCGCCA GTCTTTCTTT AGATTCTTTT GTATAATTAC GATTTTCATG TGTTAATATT 1740

TCAAAATGAT TTGTACATAT ATTATCATAA CGAACATCTA TTGATCTCGG TGTCACTTCA 1800 

ACAATTGCAT GGTTCAATGA TTTGTCCATC AGTATGTAGC TAAATGAGCT TCTGTGTGGT 1860

ATTTCTTTCA ATAATTGGAT TGCTTCTGTT ACATTTCGGC AATTTTCAAG AATTAGACGA 1920

CCAATCATAT AACATACAAA ACCATTTGCT GGTTTCTTCC GGTGCATAAA GTTATAGCCC 1980 

ATAGTTAATC CTGACTCATT CATACCATCC ATTCTTCCAG TTACCCTTGA TACAGGACCA 2040

ATTTGAGCTA AACCGCTATC TGTAGGTTGA TAAAGTAAGT AGCGACCATC ATAAGTTGCA 2100 

GGGTGGTAAT CATAATTTCT AACCATGAAG TCTTTGCCTT GAAAGACCGT GCAaCCACTT 2160 

TCTTTTAAAT CGGTAAAACG ATAATGTCCA AAGTTTAAAA TAATTTGGCG TGTTGGCATT 2220 

TTGAGTATAC TTTGTAGTCC CATTAATTCT TCCCATATTT GAGGTGCGTA TGTTTGGAAT 2280 

ATTTGATAAG TTTCATTTAC ATCTATATCG AAACGTGGGA CaCnTTTTTT CCATTCTTTT 2340

TCTCGATTTT TTAGAAGAGG TGTTTGTTGA AGCCATTTAC CAGTTTTAAC ACCTAACTCG 2400 

AAATGTGAAC CTCTAAAAGT CATGATATCT GATGTCACTT GTTGCATATC ATCGGCCCCT 2460

TTCTTTTTAG TTGTAATATA TTGTAAATAA ATAGTAATCG TATGTATATT GAATGTCATG 2520

TTAAATAAAG TTATATTTTA CTAAATGAAA TATAAAATTG TTTGAGGTGA TTTCTCGGTG 2580 

TATAAGACTT ATCAATCAGT TAAAACATAT TTTTATAGAT GGTGGGGATA TTGAGTTAAA 2640 

AACTTAAAAT CATCTTATCA TAAATATCAA TCTTAAGTTA GCATTCACGA TAATAGTCAT 2700

TGTTAACATT AGCATATAAG GTCATGTCAC GTTGAAACAG AGGTTCCTCG GCATTTTTGA 2760 
TTATTTAATG ATTATTCTAT ATATGATAGT ATAATGAAAT GTAGATAGGT ATTTAATTTA 2880

ACAGAGGTGA AATTGAGATG TGGAATTTTA TTAAATGCGT GkTTAAATTC GTATTTAGCT 2940 

TAGTTGCTAT TACAACATTA GTTGCTGGTG TTGGTGTAGT AGCATTTGCT TATATCTTTA 3000 

AAAAAGATTT TGAAGATATT GAAAGAAAAA CTAAAGAAAT TATTTCTGAT ATTGAAAGTA 3060

AAAATAACTA ATAACATTTA GAGGCTGGGA CATAAATCCC TAAAAAACAG CAGTAAGATA 3120 

ATTTTCAATT AGAAAATATC TTACTGCTGT TCTCTATTTn ATcAmTACTC CGTATTGAAT 3180

GGCTTCGCTT TCCTAGGGTG CCGTCTCAGC CTTGGTCTTC GACTGGCACT GCTCCCTCAG 3240 

GAGTCTCGCC ATTAATACTA CGTATTAACA TGTAATTTTA CTTTGGAAAT ACTTTTAAAA 3300 

AATAAGACAC TTTGGCCCAA CTTGGCACAT AAATGTAAAA TTCAAT 3346
(2) INFORMATION FOR SEQ ID NO: 147:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2375 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 147:
GTTGAAGAAA GAAATATAAC AGTCAATTAT AATTATAACC TTGTTGAAAT CGACGGTGAC 60 

AAAAAAGTGG CTACATTCGA ACATATCAAA GCATACGATA GAAAAACAAT AAGTTATGAT 120 

ATGTTACATG TAACACCACC TATGGGTCCC TTAGATGTAG TAAAAGAAAG TACACTTTCA 180 

GATAGTGAGG GTTGGGTAGA TGTTAACCCA ACCACATTAC AGCATAAAAG CTACTCTAAT 240

GTATTTGCAC TTGGTGATGC TTCAAATGTA CCTACTTCAA AAACAGGCGC ACTATTcGTA 300

AGCAAGCACC TATCGTCGCT AATAATTTAT TGCAAGTGAT GAATAATCAA ATGTTAACGC 360

ATCATTATGA TGGTTATACT TCATGCCCTA TTGTTACTGG ATATAATAGG TTAATACTTG 420

CAGAGTTTGA TTATAATAAA AATACTAAAG AAACAATGCC GTTTAATCAG GCCAAAGAAC 480

GTaGAAGTAT GTATATATTT AAGAAAGATT TATTACCTAA AATGTATTGG TACGGCATGC 540

TAAAAGGATT AATATAATAA AGTACAGAAA ACAATAAATT TTTAATGAAA AATCTTTTAC 600

TATAAAAGAT TAAGTATTTA AATGACGTGT CAGTGTTGTG TTTATATGTC GTGAATTTTT 660

AGCTCTAAAT AGTATAAGAT TGAAAAAGTT GTTACTGTTT TAAATGATCA CGATGAAGTC 720 

ATTCAATAAG AATGATTATG AAAATAGAAA CAGCAGTAAG ATATTTTCTA ATTGAAAATC 780

ATCTCACTGC TGTTTTTTAA AGGTTTATAC CTCATCCTCT AAATTATTTA AAAATAATTA 840

AGATATTCAA ACCACGTGTA CTCAAAATGA TAGCTTGGTA TGTACCTCCA ATAGTAATTT 960

CAATAACTTT GTCTGTTGAA CACTAAGAGC AATTTTAATT TCATAATGTG TTGTAAACAT 1020 

TTTTTTTGAT TGGAGTTTTT TTCTGAGTTA AACGATATCC TGATGTATTT TTAATTTTGC 1080

ACCATTTCCA AAAGGATAAG TGACATAAGT AAAAAGGCAT CATCGGGAGT TATCCTATCA 1140 

GGAAAACCAA GATAATACCT AAGTAGAAAG TGTTCAATCC GTGTTAAATT GGGAAATATC 1200 

ATCCATAAAC TTTATTACTC ATACTATAAT TCAATTTTAA CGTCTTCGTC CATTTGGGCT 1260 

TCAAATTCAT CGAGTAGTGC TCGTGCTTCT GCAATTGATT GTGTGTTCAT CAATTGATGT 1320 

CGAAGTTCGC TAGCGCCTCT TATGCCACGC ACATAGATTT TAAAGAATCT ACGCAArCTC 1380

TTGAATTGTC GTATTTCATC TTTyTCATAT TTGTTAAACA ATGATArATG CAATCTCAAy 1440 

ArATCTAATA GTTCyTTGCT TGTGTGTTCG CGTGGTTCTT TTTCAAAAGT GAATGGATTG 1500 

TGGAAAATGC CTCTACCAAT CATGATGCCA TCAATACCAT ATTTTTCTGC AAGTTCAAGT 1560

CCTGTTTTTC TATCGGGAAT ATCATCGTTA ATTGTTAACA ATGTGTTTGG TGCAATTTCG 1620 

TCACGTAAAT TTTTAATAGC TTCGATTAAT TCCCAATGTG CATCTACTTT ACTCATGCGT 1680 

TTGATAAAAA CTTAAATAAT ATTAATTCGG TCATCAGTGG CGTTAAATCT TTTATCATTT 1740

TTAGTTATAG TTGATAAATT TATATTTATA AGCATATATG GATATTTCAT CAAAAATTTT 1800 

TATTTATATA AATCCGAACT GCATACATAT TTGTTTAAAT AAGAGGTATT ATTTTTCGGG 1860 

AAATTGCTGT CTGAGTTAAA AGGATTAGTT TTATAAAATG AGTTGAACTA TAGCCAAAAA 1920 

CGATTAAAAT ACTGATAATC CATTTTTGtA TTATGTTAGG GACTTTTTTA CTTAATTTTA 1980 

ACCCTATTGG aGCmAATATA ATACTCCCTA TTATAAGGAA TAAGGCGTCA TATAAaGGGA 2040 

TATAACCTTG AATAAGTTTG ATGACAAAAG CACCAATTGA AGATATAAAA GCAATTACTA 2100

TACTATTAGC GACTACAGTA TTCATTGGTA ATTTGAATAA AACCAATAAT ATAGGAATAA 2160 

TAATGAAGGC ACCACCTGCA CCTACTATAC CTGAAATAAT ACCAATGAAA AGGCCAATGA 2220

TAACTAATAA ATATTTATTA AATGAAGACT TTTCGGAACT AGGTTtCACT TTAATAAACA 2280 

TTAATGTTAA TGCAAGTAAA GCAATAATGA TATATACCGX ATTTACAAAT GTAGCATCAA 2340

ATAAATTTGC TAGAAATGCA CCTAACATAC TCCCT 2375

(2) INFORMATION FOR SEQ ID NO: 148:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6115 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 148:
GAGGTTTCTA GACAAGCTTT TAATAACTTA CCAAACTCAT TAAgrTGGTT gTGtTGGACT 
GCCtATTATC mAAGtATTAT GaGTTGTTTA ATATTAGtGC TAArACATAC GAAGAGTGGT 
TTAAACAATT TAGTAGTAAG AAAGCACAAT TCAGTATTAA TCTCACGGAT AAATGGATAA 
TTCAAATCGC ATATGGTAAA TTAATAATAA TGGCTAAAAA TAATGGCGAT ACATATTTTA 
GAGTTCAAAC AATTAAAAAG CCAGGTAATT ATATTTTTAA CAAATATCGA TTAGAGATAC 
ATTCTAATTT ACCAAAATGT TTATTTCCGC TTACAGTGAG AACACGACAA AGTGGCGATA 
CATTTAAACT GAATGGGCGC GATGGTTATA AGAAAGTGAA TCGCCTGTTT ATAGATTGTA 
AAGTGCCACA GTGGGTTCGG GATCAAATGC CAATCGTATT GGATAAACAA CAGCGCATTA 
TTGCGGTAGG AGATTTATAT CAACAACAAA CAATAAAAAA ATGGATTATA ATTAGTAAAA 
ATGGAGATGA ATAGCGTTAT GCATAATGAT TTGAAAGAAG TATTGTTAAC TGAAGAAGAT 
ATTCAAAATA TCTGTAAGGA ATTGGGAGCA CAATTAACAA AGGATTATCA AGGTAAACCA 

TTAGTATGCG TGGGTATCTT AAAAGGCTCA GCAATGTTTA TGTCAGATTT AATTAAACGA 
ATTGATACCC ATTTATCAAT TGATTTCATG GATGTTTCTA GTTATCACGG AGGCACTGAG 
TCAACTGGTG AAGTTCAAAT CATTAAAGAT TTAGGTTCTT CTATTGAAAA TAAAGACGTA 
TTAATTATTG AAGATATCTT AGAGACTGGT ACTACACTTA AGTCAATTAC TGAATTATTA 
CAATCTAGAA AAGTTAATTC ATTAGAAATA GTTACTTTAT TAG ATAAAC C AAACCGTCGT 
AAAGCGGACA TTGAAGCTAA GTATGTAGGT AAAAAAATAC CAGATGaATT TGTTGTTGGt 
TACGGTTTAG ATTATCGTGA ATTATACCGA AACTTACCAT ATATCGGTAC GTTAAAACCT
GAAGTGTATT CAAATTAATT TTTTAATCAA TTTCAGTTAT TATTACTATG CGTTTGAGAA 
ATAATAGTGT AGACTCAAAA ATATGAAAAA TGTATTTCAT ATATATTTAA TTTTAGACAA 
GACATATGTC TTGAAAAGTT GAAAAATATA GAGATTGATA AAACTAATAC GGGTGTGAAT 
GACATTGATG TTAAGCTCAA TTACTAGCTT ATAAAACATG TCATATGTTA CAATTTTTGT 
TAGTTTTATT ATGGGAAGTA GGAGGAAATG ACGCATGCAG AAAGCTTTTC GCAATGTGCT 
AGTTATCGTA ATAATAGGCG TTATTATTTT TGGTCTATTT TCATATTTAA ACGGTAATGG 
AAATATGCCG AAACAGCTTA CATATAATCA ATTTACTGAG AAGTTGGAAA AAGGTGACCT 
TAAAACTTTA GAAATCCAAC CACAACAAAA TGTCTATATG GTAAGTGGTA AAACGAAAAA 
TGATGAAGAC TATTCATCAA CTATTTTATA TAACAACGAA AAAGAATTAC AAAAAATTAC 
TGATGCTGCT AAAAAGCAAA ACGGTGTAAA ATTAACGATT AAAGAAGAAG AAAAACAAAG 
TTTCTTCCTA AGCCAAGCAC AAGGTGGCGG TAGTGGCGGT CGTATGATGA ACTTTGGTAA 1800
ATCTAAAGCA AAAATGTACG ATAATAATAA ACGTCGTGTT CGTTTCTCTG ATGTAGCAGG 1860
GGCAGATGAA GAAAAACAAG AATTAATTGA AATTGTTGAT TTCTTGAAAG ATAATAAAAA 1920
ATTCAAAGAA ATGGGATCTA GGATTCCTAA AGGTGTCTTA CTTGTTGGAC CTCCAGGTAC 1980
TGGTAAAACA TTACTTGCTA GAGCGGTTGC AGGTGAAGCT GGCGCACCAT TCTTCTCTAT 2040
TAGTGGTTCA GACTTTGTAG AGATGTTTGT TGGTGTTGGT GCGAGCCGTG TTCGTGACTT 2100
ATTCGATAAT GCTAAGAAAA ACGCGCCTTG TATCATCTTT ATCGATGAGA TTGATGCTGT 2160
TGGTCGTCAA CGTGGTGCAG GTGTTGGTGG CGGTCATGAT GAACGTGAAC AAACCCTAAA 2220
CCAATTATTA GTTGAAATGG ATGGTTTCGG TGAAAATGAA GGTATCATTA TGATAGCTGC 2280
TACAAACCGT CCTGATATCC TTGACCCAGC CTTATTACGT CCAGGTCGTT TTGATAGACA 2340
AATTCAAGTT GGTCGTCCAG ATGTGAAAGG CCGTGAAGCA ATTCTTCATG TTCATGCTAA 2400
AAACAAACCA CTTGATGAAA CGGTTGATTT AAAAGCAATT TCACAACGTA CACCTGGTTT 2460
CTCAGGTGCT GATTTAGAGA ACTTATTAAA TGAAGCATCT TTAATTGCTG TACGTGAAGG 2520
TAAAAAGAAA ATTGACATGA GAGATATCGA AGAGGCAACG GATAGAGTTA TAGCCGGACC 2580
TGCTAAGAAA TCTCGAGTTA TTTCTAAGAA AGAACGTAAT ATTGTTGCTC ATCACGAAGC 2640
TGGTCATACA ATTATCGGTA TGGTACTTGA TGAGGCAGAA GTAGTGCATA AAGTTACTAT 2700
TGTTCCACGT GGACAAGCAG GTGGTTATGC AATGATGCTA CCTAAACAAG ATCGTTTCTT 2760
AATGACTGAA CAAGAGTTAT TAGATAAAAT CTGTGGTTTA CTTGGTGGAC GTGTATCAGA 2820
AGATATTAAC TTTAACGAAG TATCAACAGG TGCTTCAAAT GACTTCGAAC GTGCAACACA 2880
AATCGCACGC TCAATGGTTA CGCAATATGG TATGAGTAAA AAATTAGGAC CATTACAGTT 2940
CGGTCATAGC AATGGTCAAG TATTCTTAGG TAAAGATATG CAAGGTGAGC CTAATTATTC 3000
AAGCCAAATC GCATATGAAA TTGATAAAGA AGTTCAACGA ATCGTTAAAG AACAATACGA 3060
ACGTTGTAAA CAAATTTTAT TAGAGCACAA AGAACAATTA ATTTTAATTG CTGAAACATT 3120
ATTAACAGAA GAAACATTAG TTGCTGAACA AATTCAATCA TTATTCTACG AAGGTAAATT 3180
ACCTGAAATT GATTATGATG CAGCTAAAGT TGTTAAAGAT GAAGATTCTG AATTTAATGA 3240
TGGTAAATTC GGTAAATCTT ATGAAGAGAT TCGTAAAGAG CAATTAGAAG ATGGACAACG 3300
TGACGAAAGT GAAGATCGTA AAGAAGAAAA AGATATTGCT GAGGATAAAA AAGAAGCTGA 3360
TAAATCTGAT GAAAAAGATG AACCAGCACA TCGACAAGCC CCAAATATCG AAAAACCTTA 3420
CGATCCAAAT CACCCAGACA ATAAATAATC GATTATATTC AGTACCTCTT TCTATGATAA 3480
GTTGCTATAG AGCCTGAGGC TTCTCCAGTA TTGAGCGGTG GTGAGCCAGG TCCACATAAA 5400

TTACAAGGTT TAGGTGCTGG ATTTATTCCA GGCACTTTGA ATACAGAAAT CTATGACAGT 5460

ATTATTAAAG TAGGAAATGA TACAGCGATG GAAATGTCTC GTCGAGTTGC TAAAGAGGAA 5520

GGTATTTTAG CAGGTATTTC ATCAGGTGCT GCGATTTATG CTGCCATTCA AAAAGCAAAA 5580

GAATTAGGAA AAGGTAAAAC AGTAGTAACA GTATTGCCGA GTAATGGTGA ACGCTACTTA 5640

TCAACACCTT TATATTCATT CGATGACTAA TTAATGTCAT TTAAAAGAGT GAGTTATCTT 5700

TTTGAGATAA CTTGCTCTTT TTTTCTACCA TGTATATTTT TAAAAATATG AGCGTTAAAT 5760

TAAACATTTT TCTGATAAAA ATATCCAGTG AATGATAAGA TAATAAACGT ACATACTAAT 5820

AACTAGTAAA TAGCAGGAGT AAATTTTATT AGAGTTAAAC AATACATAAT TAAAGGGTGG 5880

TTAACATGAC TAAAACAAAA ATTATGGGcA TATTAAACGT CACACCTGAT TcATTCTcAG 5940

ATGGTGGAAA ATTTAATAAT GTTGAATCAG CTATAAATAG aGTGAAAGCC ATGATAGATG 6000

AAGGTGCTGA CATTATAGAT GTTGGAGGTG TTTCAACGAG ACCCGGTCAT GAAATGGTTT 6060

CATTAGAAGA TGAGATGAAC AGAGTATTAC CTGTTGTTGA AGCTATTGTC GGTTT 6115

(2) INFORMATION FOR SEQ ID NO: 149:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10401 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 149:
TAGATACTGG GnTAAAcaTc AAAAATAtyT GCtTaTTCaC GTGTTTAcGc TCCCtCAAAC 60
GCAACGTTAA TTGCGTGTAA TCATTTAGTG TGAATTcAGA CGCTTCTTCC ATGACTATGT 120
CTGATATGCC TTTTATCGAC TTTATTTTCT CTGGGTTATC TAATCCTTTA AACAAAAAAA 180
CTGCGCCGTT TGGCAATTCA ACTTTGTTAT CAGTCTTATT CCAAAGGCAC ATGTCCCAAA 240
TACCAAAGTT TATCAAACAA TCTTTAACAT CTTCGAACAA ACTATCTTTA ATTGTTGATT 300
GTACTTTTCT AAGCCACAGT ATACGCCTAG GATATTTCCA ATCTTGCAAT GCTTTGAGTA 360
CAACTTTTTG TATAACGCCG TGAGACTTAC CGCTCGAACC TCCACCGTAA TGJcACTTCAG 420
TGAAGTtATC GTAATTGGTT AGTATTTCGA ATATGTTTCT ATTGAAAACA TTAGACGGTT 480
TGTTAAAGTT TAATTTAACT TTCGTCATCG TACTCACCAA TATTAATCTC AATATTCTTC 540
TGAGTAATTT CTTTTTTATC GATATACGCA CCATGTACTT TTAGTATGTG GTCAATAGAT 600
TTTAAATGGT CATATTTCTT ACTGTAAGCC TCTTGAGGTT CTCCTCTAGC AATAGAAGCA 720
GATAACGCTA AAGCTTCTGT AATACTCATT AAACGCTCTT CTTGTATCTG TTCTAATCGT 780
TCTTTAATAT ATTCCGAAAC ATTAACATTT CTTAACAATC GACTTGCTAA AGACTCTGCT 840
GTTTTCTTAC TATAACCTGC TGTAATTGCT GCTTTTTTAC CATTACATCC ATTCATTATA 900
TATTCATCTG CGAATCTCTT TTGTTTTTCG TTCATTTCAT TTACCACCAA CTCTCGCGCT 960
ATACGCTTTT TAAAATTAAA AAAGGATTGG CTATAATCAG CCAACCCACA TAGATCCTTT 1020
ATTCCTAATT GCGATAAGGG AAACGCAGTA CGATAGTCAA TATCCTACAC TATCATAATA 1080
TCTCATTTAA GGTATCAAAA ACTGCCACTT TACTGCCAAT TTCAGTCTTC CCCTAACTCT 1140
TCCGCCAATC TAGATATGAT TTTTCTTTTG ATTCTATGAG CAGTTCTATC AGAAATGTGT 1200
ATGTCAACAC AAACTTTCAC TAATTCCTTT TTATTAAAAT AATACTCTTG AATGAATTCG 1260
CGTTCTTTCC TGCTTGATGT GTTGATTATA CGTTCAATAG CGCTCTTAAA CTCAAGGATT 1320
TTACCTCTTC GTATACTACA AAGATAATTA GTTACTGCCA TTTCTGTTTT CGATGTATTA 1380
GACGGTACAA ACTCCCCGCC TATATTTGTA TCTGTTGGAA TCCACGGTGT CATTATTTCA 1440
CTTCTTAAAT CTTCAAGTTG TTTATGATAA TTAGGATAAT CACACAACTC ATCTTCTAAC 1500
TTTCGAACTG TTGATAATTT TAATCCGTAT TTCTTTTTAG TCATGAATAC CCTCCGTACA 1560
AATATGTTTA ATCTTCAAAG TGTCTCAATC TACTTCTTAA TATCTCTATC TCTCGCTCTT 1620
TAACTTTTAC ATCACCTTTT AACTGTTCCG CTTGTAACAT CACACCAAAC AATAAGATGA 1680
CTAGTAATAT AATTGCTATG ATTAACCACA TCATCTACTC CGACACCTCC GCCCTCATCA 1740
AATCAGACTG ATCACTCAAC TTTGCGAAGT CACTTGGCGC CTCTACATCA TCATTAGCCG 1800
TCATCATAAT ATATACTTGC TCAGTTACAT ACTTACCTAA CTCATACATC GCTAGTAAGA 1860
ATAATAGTCT CAAAATTTCT TTAACCACCA CTAAACACCC CATGTTAATT TATCGATAAT 1920
TTGTATAGCT TGTTTTAATG CGTCTCTTTT TTCTTTGATA TCTCTATTAT CGCCATCTTC 1980
ATCAGCTGAC ATTAACTCAC TGTCATATTC ATATAATAGT TCTGATATTT CATTACTAGC 2040
TACTACTAAT AAGTTTTCAT CTACATCAAT CGTTACCGTT TTCTTTGGCA TCTCCATCTC 2100
TCCTTATCTT AACTTGTGCC TCGTATTTGC GCTCAGCTTC TTCTTTACTC TCTGCCTCAA 2160
CAACTGTAAA CGTCTGATTA TCTCTAGCAG TAGTAAAATG TTCATGTGGT TGTCCTGTTG 2220
AATCTTTGAA TGTTGTGACT AAGTATTGCG TCACTTCTTA TCACTCCTTT GAATGATTCT 2280
AAGTTTTTCT ACGAATAAAA GTATTAGTAC AACACTCAAT GTAGCCAACA TATTTTTTTG 2340
CTTTGCAAAA TCTACTATAA CGATTAAGAC TAATAACATT CCAATTCTGC ATGTAAATAA 2400
TACAAGTATT GGAACTAATG TAATGATGTA ACTCACTTCC CCAAAACCTC CTTGACTCGA 2520

TCTAAGATGT CTTTACACTC CGCTACTTCC GAAGCCTTTT TCTCCACGTT CTGAAACACT 2580 

TTCGAATTCC TCCACTTGCT TTAGTTCAGG TGTCCATATA GGCACGATAA CCAATTGAGC 2640 

TAGTTTGTCT CCTTCGTTGA TTTGATAAGT TCCGTATTGT CTTATGGCGT CACTCAAATC 2700 

GATTTCTCCT TTAATATCAA AAACACCTGG TGTGATATAA CCATTCGATG CAATAGCGTC 2760

ATTCTTGATA TTAATCCCTA AATTGCCGTG ATATCCCGCG TCTATCTTGC CTGTTTCAAT 2820

CACTAAATGC GTTTTACTAC TTACACCACT ACGGCTAGTT AATAGTCCGA CATAGCCCTC 2880

TGGTATGCTT ACAGCTACAT CTGTTTTAAT CACTGCCTTT TCTTGTGGCT CAAGTACGAC 2940 

AGTTTCAGCT GAGAATATGT CATAACCTGC ATCCGTCTTA TGATTTCGTT CGGGCATTCT 3000 

AGCATTTTCT GATAATAGCC TTACTTGTAA TGTGTTAGTC ATTTTCCTGC TCCTCCCTAG 3060 

CTGTAGCAAA CGCTATTCTC AATTTCAATC TTTCAACAAT ATGAATTAGT GCGGTATTGA 3120 

GGAATATTTC AAATTCTTCA ATGTTCTCAT CTATAAAATC AAGTATTTCT TCCTCTTGTT 3180 

CACTGTCAAA CTCGCTTAGT ACATCCCAAA TATTTATGTC GCTTTTGCTC GTTTCTAATA 3240

CTCTTTTGAT TATTTCTGAA TTACTTTTAT TACTCATTTT CCTTGTTCCT CCTCATATTT 3300

ATAGACAACT TGACCTGCCA TAATCCCTAC TGCTTCATCA AGTTCAATAC CTTCTTTAAC 3360

TGAATGTTGA ATAGCATTTG TCATTCCCTC AAGTATTTCA TCAAACGCTT GTGCTCTCTT 3420

ATACACGTCC TCAATCTCTT TTAGTAATCC CTCTGTGTCA TTACCGTTAT ACGCACTAGC 3480

ACTGATCACT GATTGTTCAA TTTGTTCGCG GTTATTCATC ATTTCCATCT CCTCTAAAAT 3540

AAAGTTAGTT GCTTCTGCTC CTCGTATTCC AAACCATGTT GCTTTATATA TGTTTCGAGC 3600 

TCTTCCGCTG TATCAAATGT CTTTTTCACG CCTTGCCAAC CTGGCACGAT ATGCCCATGa 3660 

AAGTAATAAG TGCCGTTCAC TACATGGATA TGTGCCACTC GTTCGTTATC CTGATACAGA 3720 

TATCTCTTAG ATCCGAAAAA TTGGTTTAAG TATTCTTTAC ATGCGCTATC GGTTTTAGGC 3780 

ATTTATGCTT CCTGCCATTT CTTAAACATT TGGTTATAAG TAGTATCAAA CCAGTACGGA 3840

TCACGTGAAT GTTTTTGAGG CACATTAAAC AAATGTGGCT TCTTCTTACG TAGTTCAGCC 3900 

TCTTTACGTC GTTGCCTAGC CATTTCACGC TCTTTGCTCT CTCGCTCCAT GATTTTGGAT 3960 

AACACAATTT CTTTATACTC AGCTAAGCGC ATACCATAAG GTGCATGTAA GGCTTCTAAC 4020

AACGCCCAGC CACCTCGTAC TCTTTTTGCA ACCATTCCTG GAGTTAAACC GTTCTTTTTT 4080 

ATCAATTCAT TTTCATGTTC GGTAAATTTA TATGGTTTAc CGTTAATCTT TACGATACTC 4140

ATTTATTCCA CCTCTATACA TTTACTTTTT TTAATCCAAT CCTCTAATTT GTGCGTGTTG 4200 
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ACATTTAAGT TAACCATCTC AGCTTTTCCG TTTTTATATC CACTAATAGT TGATCTTGAT 432 0 

ACGCCAGTTT CATTGTGCAA ATCTTGGACA CTTACGTTAT CTCTAGCCAT GATTACCCTT 4380

AAATTAGTTG CGAATACTtC GTTCAACTTC ATTTATTCCA CCTCTATATA TGCATGTCTT 4440

ATTGTTATGT TGTCATACTT TAGTAATTCG TCCGGATTGT CATCTAAGCG CTTTGCCAGC 4500

GTATCTTTTT CTTTATCCAC ATCATCGTAA TGCTGATATT CAACTTCTGT AGGTATTCTT 4560 

ATATCAATCG TTGCGTTTAT ATATGCTTGT TGTTGCATTA GATCACTTCA TTTCTCTTTT 4620 

TCTTTTACGT CTGACTTTCA CTAAGTCCTC ATATACCATC CATTCTTGAC CTGTGTATTT 4680 

AGGCGCTTTA CATATCCACG TTAAATTCAC ATCTCTATAC TGATATCTGA ATATCTTCGC 4740 

TTTGATGTTG GCAACTTCAG TCGCCTTACC TTTAACGTCT ATAACTTCAA CCAGTTTCCC 4800

TTCCTTCCAC AAAGAGAAAT CGGCTATATA CGTAATCGGT CTTTGTTTCC CGAATTTAGG 4860

TTGTAATTCA AATTTCGGTT GTATTTCGAT ACGATCATAG TTAGTGCCAT TCATATTACT 4920

TTCTAAATAT TGGTAATATT CGCACTCTAC TTTGCTATCA AATACAATTC CTTTGTACTC 4980 

AACTTTCTTA GCATTGTATT TACTCATTGT GCCACCTCTA AATATCAAAT ATCGTTGCTT 5040 

GCAATCCTAG CTCTTGCTCA TATAGAAGCC CGTGAGCGCC TTTGAATCGT TTTAGGTCAC 5100

TATCAGTCAT AATTTTCTTT TCGTCGCTGA AATGGGCTCC TGTGAGCGAA TAAACTTCAT 5160 

TTACGTTGTC TTTATACTTG ATGACCTTAA TATCTTCCGT GCCATCTTCT CGGTATAAGT 5220 

AATATTTTTC TTTCGGCATT TTTTAACACT CCTTAATGTG TGTTTTCTTC CAGTTGATTT 5280

CATTCATGAT TTTCTTTTCA ACTCTGTCGT AATCATCGAA AGGCGATAAC TCGTTATTGT 5340

CCAACAATCT ATTGACCGCC CAACCAGTCT CGATATATAC ATTTGCTACA ATCGGGTCGC 5400

TTTGCTTTGT CTCTTCATAC ATCGATTTCA ATAAGCTTTT GAATTGCATT ATATTCATGT 5460

GAAAAACCTC TGAGTCTTCT TGTAATACTC AAATTCAATT ATTCCGGTTT CGCCGTCTTT 5520 

GTTTTTGGCT ATGTTACATT CAACAATAGA TTTGCCAGTG ATACTGTCAT CTTCGTCACG 5580 

GTTATAATAA TCATCACGGT AAAGTAGCAT CGCTAAACTC GCATCTGCTT CTATTCCGCC 5640 

TGATTCTTTC ATGTCCGATA GCATTGGTCT TTTATCCTGT CTAGACTCGA CACCACGATT 5700 

CAGTTGTGAA AGTAGTACGA TGATTGCGCC TGTCTCGTTA GCGATTATCT TTAAGTCACG 5760 

TGATATCTTT TCTACTGCTA CACGTCTATC AACTTTCGCA TCAGTATCCA TCAGTTGAAG 5820 

ATAATCTATA AAAATAACTT GTTGCCTGTC TGAATGCCTC ATTGtTGCGC TCGCACATCT 5880 

TGCGGTGTGA TATTACTTTT ATCAGAAATA TCGATGCCTA ATTTCATGAT TTTATCCATC 5940

GCATTCGTTA ACTTTGTTAA GTCATCCGGC GTTAAGTTCC TGATTTCTTT TATCTTTGTT 6000 
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AGACTAAAGA 


AAGATGTTTT GTATCCATTT TGTGCTATGT TCAGCATCAT GTTTAATGCA 6120

AAACCTGTCT TACCCACTGA GGGACGCGCT GCGATGACGA TTAATTGTGA TGGTTCTAAT 6180

CCCCCTATTT TGTAATCCAT TAGCTTGTAA CCCGTCTTAA TTTGCTTCTT AGGGCTATCG 6240

CTGTATAACT CTTCGACAAA CTCCTCAACA AACTTCTTGG TTCCATCTTC TTTTTTGTTA 6300

GTAATTGTTT TTAAATCCTT GAGTTCATCA ATCAAGTTGT TAAAGTTTTG GTTCGTAGGT 6360

TGTTGTTTGA ACTCAGTTAC CAATTCGTTA GCTTTGTTGA GCTGATAACT TTCCAATAAT 6420

TCTTGTTGAT AACGTTCAAA GAAGCCATAT CCAATGAAAT CGGAGTTGTA AAGTTTAGTT 6480

ATAGTATCTG CATCTAAAAA TTCTTTATCT TTAGTTGCTT TTAAATAGAT TTCTTGATGA 6540

TCTATCTTTC CGACGTCCAT TACATAATTG AAAAAGGTTT TAAACTTTTC GTTCGTAAAC 6600

ATGTAATCTT TAACTCTTAT CTTTTCTAAT ACGTCCGGTT GTTTAAGTAG CGTAGCGATT 6660

ATTGTACTTT CAATTTCGAA TTGTCCGTAA TTCATTCGTT TTCGCCCCCA AATTCTGCCA 6720

ACTTATTCAT GAACTTATCT AGCGCTATTT TTCTTTGTCT GACATATTCG GGGTCATTCT 6780

GCATTTTCCA TTGGTGTGTA GCGGTTTCGT TATCTACTGG CTCGATAGAT ACTTTTTTAG 6840

GTTCCTTACG CATGATTGCT GGTAAGTTAG GCGGGTACGG GTTGTTACTG TTGATATAAA 6900

CATCTACCGC TTTTACAGTT GGTTGATAAT CTCCATTTTG ACTTAATACA TCAATCCACA 6960

TTTCTAACTT CGGTTTATCA AAATCAATGT TGTATACGTA CCTAACTTTT TTAATAATTT 7020

CTAATGCTTG TGTTTTGCTC ATCGGCATTA GTCATCACTC AATTCTTTTT CCATTTGTGC 7080

AATGACATCA TCAGTAGTAT TTTTTCTAGG TGCTATTTTA TTTTCTGCAT CTTCTTTTGT 7140

TTTGACATTC TCTTTAGCCC AGTTGTTTAA AACTTTAATT AAATAGCCAC CATGCGCACT 7200

TTTGCTTTTA GTGTACTCAA CACCTACTTT TACAACTTCA AAAGCGTTTG TACCTATATC 7260

atcAtagca AACCCTAATT GTTCCATTTG ATTAGGTGTT AACTTATCAT CCAAATTTGC 7320

AATTATATAT TTTATTGAAG ATGAGAAGAC GGCTTCTCTT TCTTCTTCTT TATTCTTATA 7380

TTCTTCTTCT TTTTCTTCTT CTCTTTCTTC TTCTTCTTCT GTATCGTTAC GTAACGTTAC 7440

GGTAACGTTA CGTTTTGCTT CTAGTAACTT TTTCTGTTTC TCACGATAGC GTTGTTGTCG 7500

CAATTTATTT TTTTCTTTAT GCTTAGCTTT GCTATCTAAG CTTTGATGCT TCTCCCAGTT 7560

TGTCACTTTT ATGACACCAT TAACTTTTTC AATCATGCCC AATGTCTCAA AAGTTTGAAT 7620

TGCTAACCTT ATTGAGTTAA TAGGTCTATT AAATTCATTT GCTAACATTT CTTCGTTGTA 7680

CGGCAAGTTT TCGGATAGCA TAATATAACC TTGTTCATTG TACTTTCCTG ATAAAGTTAG 7740

TAACTTAACC CAAATAGTTA TGATCGTATC TCTTTCGGGT AAAGCTTCGA TATATTTGAT 7800
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CTCCTTTCAG CATTTTGTTG AGCCTCTCAT CAACTTTTAT CCACGAGTCA TGCAAGTGAT 7920 

ATTTATCATC AAACGACTTA ACGCCAATTG CGTGCTGTTC ATTATGATGT TGTCTACACA 7980 

GTGCTAACAC ATGTTTGTCG TAGTGATTCA TTTTGTTTCT GTTCATGCCT CTGCCGACTG 8040 

CTTCATAATG TGCCAGGTCT GCGTGAGGCT TTCCGCATAT TACACAGTTG CGGTTGATTG 8100 

TAGCCCAATA TAATAACGCT TTATCTTCGC TTAACAACTT ACTCGTTTCT ACACTCATAG 8160 

GTATTTGATG ATGAAACATA AACGCTATAA TCAGTTCTAT TAACTCCCTT GCAACTTTCA 8220 

TAGAACAGTC GCGCAGACTG ATTTCTTCAT AACCTTTCAT AATTTCCAAT TCTGTTTGTA 8280 

ATAATTTTCT AGTTGATTCT ACTGGTTCGC CCCAGTGAAG TTCTATATCT CTACACATTG 8340 

CGAATATTTT TTTGCGTTGT TCTATAGATA GTTTTTTATT GTCCGGAACC TCTACTTCTG 8400 

CTTTTAGTGG ATATCCGTTT TCTAGTAAGT CAATGTGACT TTGTTCAAGT TCAACACCAG 8460

TAGCAACGAC GGAATAAGTA CCGTCATTGT CTTTCTGGTA TCTTGTAATG TATTGCATTT 8520

AAACCACGTC CTAGAACGGT AAATCATCAT CATTGATTTC TATTGGACCA TTAGCATTAG 8580

CGAATGGGTT TGATTGTTGA CTCATTGGCG TCTGTTTCCC ATTTGCTTGC TGTTCTTTTT 8640

GTTTCATCTC ATCAGTTTTA GGTTCTGGTT TATTAACTAC TTCATCGTCT TTATTCCAAA 8700

CTTTTACATA TGAGAGTCTT ACAAAATACT TGCCTTGTTC CTCGTTAAAT TTATTTTTAA 8760 

GTACAATAGT TCCGATTTTG TTAATTAATT GATCTGTGTC AAAAGTTAAA TCTGGTAAGT 8820 

TCAATTTAAT TCCTAATCTA CTAAGTAACT CGATATATTG TTTTTCTTGA TAATCTTGTT 8880

GGAATGGTGG GACGAATTGG TTGTGTTTGT ATTGTTTACC TTCGTTGTTT TCAAAAACAA 8940 

TCGTGAAGTA TCTGTTTTCT CTGTCGTTAA ACTCGACATT TGCAACTTTT ACTGTAAATT 9000

CTCCAGCTCC TAAAAAGTCC CCACCTTTCA TGAATGCCTC TTGATTAGTT TCTTGAATGT 9060 

ATTGTGTTCT ACCAGTGATT TTCATAATTT TTATACCGTC CTTTTAATTA ATTTTTAATT 9120 

ACCATTTCTA ATTGCTTGTA CAACATCGTT AATACTTGGA TTAATGAAAC GTTTGTTGTT 9180 

AATTTTGATG TTGCTTGAGT GTCTTATCTT TGTCTCGAAT AAATTTGATG GTTCAGCGTT 9240

AAGTACATAT TGATAAGTTT TTTCGCCGTC TTGCTCATGT TCTTCTATTG TCATTCTTGC 9300

TAACACGTCA GATTGACTGA TGACTGCTTT TTTTATTTGG TCTTGTGCCT CTATCGTGAT 9360

TGTTGGATTG ATAGTACTTC CCTCATCATC TTTGTCTTTG TTAATGCCCT CGTGTCCGCT 9420

TATAGCAAGA TGAAATTGAT AATGTTCTTG TAATTTAGAA ATATAACGAT AAATACTTAC 9480

AATGCGTGTA GCACACTCGC CCCAATCATT AAATGTCGGT TTCTTTGATT TACCGTCCAT 9540

GATGTCGTCC ATAGTGATAT CACGTAACTT TTGGATTGTT TCAATCACTA CAACATCAAT 9600 






AAAATGCTTA TAATTCTTAA TCTGCACAAC TGCCCCATCT TCTGTTACCG TTGTTCCGTC 9720 

CTCATTTATA TCTAGTACTA AGGCATTGTT ATCTTTTGTT AAAAACGTAG TTTTACCAGT 9780 

ACCGAACTTG CCGTATATCG CAAATTTATA AAACTTGTTT GCATTTTGTT TGCTGATGTC 9840 

TTTTACACCT AGTTGCGTTA AAATATCGAC ATCTTGATTA GTTTTTTCAG TCATCTATTC 9900 

TCCCACCTTT ACCGTGTATG ACGTTGGTTT CTCCACAATG CTAGCACCCT CTAAAACTTC 9960 

GCCGTTTGCG TCAATCAATG TGCCGTTTTC AGTTACATTG AAATCTTTCT TAATGTCTGA 10020 

TTGGCTAAGT TTTTTAGTTA CTTTTACATA GTTGTCAAAA CCTCGTTGCT CAAGTTGTnT 10080 

AATGACTTCT TGCTCATTGC TAACTTGAAT GACTTTTGAA CCTTTTCTGG CTGTCACTTT 10140

TCCGTAAGtG TATTCAACTT GAATTTGCTA TCTTGTTCTT TTTGTATTCT GTAATATTCA 10200 

ATTACAAGGC TTTGTAAATA TTCTTTGCCA CTCTGTAATT TTTCTACTTC TTTATCTTTC 10260 

CATTCGTTTA TGCGTTCAAT TTCTTTATTT GCTAAATCGT TGATTTCATT CTCTTTAGTT 10320 

GTGATTGCAT CCAGTTTCTn AAAAACCCAG TTAGCACTGT CTAGATCAGT nACTTTGAAT 10380

CGGTCGTCTT GTTCGAATGT n 10401 
(2) INFORMATION FOR SEQ ID NO: 150: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2989 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 150: 



ATTGTTTAAA TTTAAGTTTA TAGTAATGTT 60

GTGTTGTATA TATAATCATT CATCTAGTTA 120

CGATGCAATT CATTGATGGA TGTTTTTAAT 180

TTAAATTCAC TTTCTTTCGA ATCGATTTTT 240

ATTAAAGCTC TCTGTCATAT CTATTCCCAT 300

ATTATCACCT AATTCTGCTT TAATCGTATT 360

ACCATAGGTA TGATTTATTT CACGTGCAAG 420

GTTAGCTAAT GGTGAAAAAT ATCCTGTTTT 480

TTGCATTTCT ACCATTGATT TCATTTCTAC 540

ATATATCAAT TTGAAGTCTC ATGCATGTTT 600
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AATTCAGTTT ATATAAATGT AATGCATTCC TAACTAAATT AAATCAATTG AAATTGGGAT 720 

TATAACTTTA TGATACGTAC CACTACAATA AAATAATATA GTGAATAATC TACCATTAGA 780 

AAAATAAGCA CAAAAAAACT AGCAACCACA CAAAAATGTG ATTAGCTAGT TAATAAGTGT 840

CTAATTTAAG TTAATTGTTA ATCTATAAGA TTAATCACTT GAACGCGCAA TCAAAATAAT 900 

ACGTACAAGC TCTGCTACAG CGACTGCAGT TGCTGCAACA TAAGTCATTG CTGCTGCAGA 960 

TAATACTTTA CGCGCATGCT TGTATTCTTT TTCATTTACA ATGTTCAATG CCGTAATTTG 1020 

TTTCATCGCT CTTGAACTCG CATCAAACTC AACTGGTAAC GTAACAATTG AGAATAATAC 1080 

CGCTAATGAC ATTAAACCAG CACCAATCCA TAAAGCAGTT GAACCaAATG CACTACCTAT 1140

CGCTGTTAAG ATAATACCTA ACATGATGAT CATATAACTT AATGAACTCC CTAGGTTTGC 1200 

AACAGGTACT AATGCTGCTC TGAATCTTAA GAACCAATAT CCTTGGTGAT CTTGAATGGC 1260 

ATGACCAACT TCGTGGGCTG CAATTGCAGT TCCAGCAACT GATGGTCTGT CATAGTTTGC 1320

AGGAGATAGT GAAACAACTT TCTTTTTAGG ATCGTAATGA TCTGTTAAGA ATCCTTCACC 1380 

TTTAACAACT TCGACATCAT AAATACCGTT TGCATGTAAA ATTTCTAATG CAACTTCACG 1440 

ACCCGTTTTA CCACTAGTTG ATCTAACTTG TGAATATTTC TCATAGTTAG ATTTAACTTT 1500

GTGTTGTGCC CATAAAGGAA GCACCATTAA TATTACGAAA TAAATTATCA TAGTAAAAAT 1560

TGAAGACAAT AAACTCACTC TCCTTTATAA ATATTTTACT GTCATTTGCC GTTTTTATCA 1620

AATCATTTAC ACTTTAATAA TTTGTTTAAT TCAATATAAA GCAAAAGTCC AAAAACACTT 1680

AGACAACATG ATAATACACC AATTTGCCAC ACATGTGTAG TTATAAAATC ATAATATGGA 1740 

AATTGAAGGT GAAAATAGTC AATATAATCA TTCAAAAACA CCCAAATCAT yGCTACACTG 1800 

ATTCCAATCA TAGAACGTTT AAACCTAGGA TAGAAGTAAA TTGCCTGAAC AGCCATTATA 1860 

CTGTGGGAAA ACATTAATAC CAAACCATTT ACTGTAATAT CACCTTGTTC AATAATAAAT 1920 

AATATATTCA TTATAACTGC CCAAATCCCA TATTTGAATA ATGTTACAAA TGCCAGTGCA 1980 

TCGATAATAC TATTTTGTTT TTGAATTAAT ATCAATGAGA TAGAAATAAC TAAGTATAAT 2040

ATTGCAGTTG GGCTATCTGG AACAAAAATC TTAAAATGCC AGGGCGTATG ACTTAATTGT 2100

TCACCATACC ATATATAACC ATAAATCATC CCTAATATAT TACAAATGAG TAGCATCATT 2160

AACCAAGAAC GTTGATAAAG TGTATATTGC CAAAATGCTT TAATTGTCAT CTGCTAAGTC 2220

CTCAAATTGA TTATGTTTAT TTACTAGCTT GAGTGTATTT AAAATTTGCG TTAGTTGATA 2280

AAAACGTTGC TTTTCATTCA TCTGTAAACT TAAATCAATA TTGTGTAACA AGTAATCTAT 2340

TAATAACGCA TGTTTATGCC GATCTATAGC CATACTATTT AAGTCATGAA GATAAGTTTG 2400
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TGACACGTTT GCGAAGTGAA TTTGAATATC AAAAGCACAG TTATGATTAG CGATATAATC 
AAATATTTCA TTTGTATTCA TTAACTTTAT ATTACGCTTA GTAAATTGAA TTGCAGAAGC 
GTGACTTCCC ACTTCTGCAA TTTCTAATGT TTCATGATGA TTAATTTTTG TATCTACAAA 
ATGAATGTTT GCCAATTTCG CCTCATTCAC TTTTATATAG TTAAGCACCC AAACTGCAAT 
ACGCGACTTA AATCGATATT GAAAAAGTAA ATATTCAATA AAACTTTCTT TAATTTGATT 
GAGTGTCTCT GACATCAAAT ACCCCATTTT AAGATTGCAA TCTTGaTAAT TCGTCATGCC 
AATTTTCGTT ACTTGGcTCT AGTTCCAACA ATTGATTTAA AATAGTAATT GCTTGTTCCT 
TTTGACCAAT TTCAATTAAA TAGAAATAAT AATCACTCAT AAAATCAATA TTTGTTTTCA 
TCGTTGGATA TGCTAATTCA AAGAAATGTT GAGCTTCTTT ATCTCGCTC 
(2) INFORMATION FOR SEQ ID NO: 151: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1143 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 151: 



CATCAACTCC 


TTAATTACAC 


TGTAAATGAT 


ATGCGTCTTT 


TTGACAACTA 


TATTTGTCAA 


ATCTACACCA 


AAAAATATGA 


TTATCCACCT 


ATGTATGACA 


TTTTGAAACA 


AACACCTCAA 


CGCCTACAAG 


TCATAATTGT 


TTACTTTCGT 


TACACCTTCC 


TGCATAATTA 


ACAGCATTCT 


AATTTTAGTA 


TGATGCACGC 


ATTTTCACTA 


AATCAAACCA 


TTCAAAGGAG 


ACTATTATGG 


CATTTACATT 


ATCTGCAATT 


CAACAAGCAC 


ATCAACAATT 


TACTGGTGTT 


GACTTTCCAA 


AACmTTCAA 



AGCTTTTAAA 


GATATGGGGA 


TGACTTACAA 


TATCGTCAAC 


ATTCAAGATG 


GCACTGCAAC 


ATACGTACAT 


CAATCAGAAG 


ATGATATCGT 


TACGTCATCT 


GTAAAAAGTA 


ATCATCCTGT 


TGCTCAAAAA 


TCAAACAAAA 


CAATAGTTCA 


AGACGTCTTA 


ACTAGACATC 


AACAAGGGCA 


AACAGATTTT 


GAAACATTTT 


GTGATGAAAT 


GGCTGAAGCT 


GGCATTTATA 


AATGGCATAT 


CGATATTCmA 


GCGGGCACTT 


GTACTTATAT 


CGACTTGCAA 


GACCAAGCTG 


TTATTTCAGA 


ATTAATCCCT 


CAATAAACTA 


TATTTATAGC 


AACATTTTAA 


TTATTTCATA 


AAATTTTATT 


GATAATCATT 


ATCGTTCGGT 


ATAAAGTAAA 


TACTATATAC 


TACTTATGAG 


TGAGGTTGAT 


TATCATGATA 


ACTAACACTT 


TTATTTTAGG 


CATCACAGGC 


CCAACAAGTC 


TTGTCGTCAT 


TAGCATTATC 


GCTTTAATTA 


TTTTTGGTCC 


GAAAAAATTA 


CCACAATTTG 
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AGTCTCACGA TACACCCAGT AAGGAATCGA AACAACAGCG AGAGCAATAG CACTGACCAC 950 

ACCTTACTGG TTCACTTTAG CGAACTACGC CATCGGTTAG TAAAAATTTT ATTGTCGTTC 1020 

GTCATTACGG TCATCGTCGT ATATGTyTCA TCATTTTGGT GGATGACACC ATTCATAACG 1080 

TATATyACCC GgCACATGTG TcCTTACATG CATTTcATTC ACAGAAATGA TACAAATAAC 1140

GTG 1143 



(2) INFORMATION FOR SEQ ID NO: 152: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7953 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 152:

CAACGCCTGA ACGTAAACCA TATCGTTTCG CGATTTCCTC ATCTTGACTA TTTACTAAAA 60 

ACTCTCTCAT GGCGATTAAT GTTTCTTTTT CTTCTTTAGT TAATGGTAAT TCTAACTCAG 120 

CTGCTTTTTG ACGCAAAGTT GGATGACCAT CTCTAATGAT GTCTTTCATT GTTAACATAT 180

ATTGCACCTT CCTTATTTTA ATTTGTTTTA GTTGAATGAC AGTAAAAAGG TTGTTAAGAT 240 

ACTCATACAT TTTTATGTGT AAATATCTAC AAAGTTAACC AACTACTGCC AATGTTTATT 300 

TTAGATAGTA TATGTAAATT TTCAaGAtAT GCgTAATTGC gTTAAAAAAT GaTTAAAGTG 360 

TTGGTTTCAA GCAATGaTAC TTTAGAAATT TATTTATCAT CTTGACTTTA AAAATTATAT 420 

TATAAATGAC GTAACTGTCA ACAGATATAC TTAGTAxTGA AGATGTGTAA TGTAATTGTT 480

TAAAATTGAT TTCCAAGCAG ATTTTATTTA TCATTTAATT TAAATAGCAA GTGGAGGTAC 540 

AAGTAATGAA ATTTGGAAAA ACAATCGCAG TAGTATTAGC ATCTAGTGTC TTGCTTGCAG 600 

GATGTACTAC GGATAAAAAA GAAATTAAGG CATATTTAAA GCAAGTGGAT AAAATTAAAG 660

ATGATGAAGA ACCAATTAAA ACTGTTGGTA AGAAAATTGC TGAATTAGAT GAGAAAAAGA 720 

AAAAATTAAC TGAAGATGTC AATAGTAAAG ATACAGCAGT TCGCGGTAAA GCAGTAAAGG 780

ATTTAATTAA AAATGCCGAT GATCGTCTAA AGGAATTTGA AAAAGAAGAA GACGCAATTA 840

AGAAGTCTGA ACAAGACTTT AAGAAAGCAA AAAGTCACGT TGATAACATT GATAATGATG 900 

TTAAACGTAA AGAAGTAAAA CAATTAGATG ATGTATTAAA AGAAAAATAT AAGTTACACA 960 

GTGATTACGC GAAAGCATaT AAAAAGGCTG TAAACTCAGA GAAAACATTA TTTAAATATT 1020

TAAATCAAAA TGACGCGACA CAACAAGGTG TTAACGAAAA ATCAwAAGCA ATAGAACAGA 1080 
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AAGAAAAGCA AGACGTTGAT CAATTTAAAT AATTAATATA ATACAGATGG TAGGAAACAA 1200 

CTAATACAGT TCCTATTATC TGTATCTTTT TTTATTAAAA CAGAACTTTT TCAAATGGTT 1260 

TAACAGTCCC ATTTATTTGT GGTACAATTA GTAAGGATAA AATGAATTTC TATACAATTA 1320 

TGGGAAAGGT ATTGTGAATT GAATGGCTCC TAAGTTACAA GCCCAATTCG ATGCAGTAAA 1380

AGTTTTAAAT GATACTCAAT CGAAATTTGA AATGGTTCAA ATTTTGGATG AGAATGGTAA 1440 

CGTCGTAAAT GAAGACTTAG TACCTGATCT TACGGATGAA CAATTAGTGG AATTAATGGA 1500 

AAGAATGGTA TGGACTCGTA TCCTTGATCA ACGTTCTATC TCATTAAACA GACAAGGACG 1560

TTTAGGTTTC TATGCACCAA CTGCTGGTCA AGAAGCATCA CAATTAGCGT CACAATACGC 1620

TTTAGAAAAA GAAGATTACA TTTTACCGGG ATACAGAGAT GTTCCTCAAA TTATTTGGCA 1680

TGGTTTACCA TTAACTGAAG CTTTCTTATT CTCAAGAGGT CACTTCAAAG GAAATCAATT 1740

CCCTGAAGGC GTTAATGCAT TAAGCCCACA AATTATTATC GGTGCACAAT ACATTCAAGC 1800

TGCTGGTGTT GCATTTGCAC TTAAAAAACG TGGTAAAAAT GCAGTTGCAA TCACTTACAC 1860 

TGGTGACGGT GGTTCTTCAC AAGGTGATTT CTACGAaGGT ATTAACTTTG CAGCAGCTTA 1920 

TAAAGCACCT GCAATTTTCG TTATTCAAAA CAATAACTAT GCAATTTCAA CACCAAGAAG 1980

CAAGCAAACT GCTGCTGAAA CATTAGCTCA AAAAGCAATT GCTGTAGGTA TTCCTGGTAT 2040 

CCAAGTTGAT GGTATGGATG CGTTAgcTGT nATATCAAGC AACTAAAGAA GCACGTGACC 2100 

GCGCAgTTGC AGGTGAAGGT CCAACATTAA TTGAAACTAT GACATATCGT TATGGTCCTC 2160 

ATACAATGGC TGGTGACGAT CCAACTCGTT ACAGAACTTC AGACGAAGAT GCTGAATGGG 2220 

AGAAAAAAGA CCCATTAGTA CGTTTCCGTA AATTCCTTGA AAACAAAGGT TTATGGAATG 2280

AAGACAAAGA AAATGAAGTT ATTGAACGTG CAAAAGCTGA TATTAAAGCA GCAATTAAAG 2340

AGGCTGATAA CACTGAAAAA CAAACTGTTA CTTCTCTAAT GGAAATTATG TATGAAGATA 2400 

TGCCTCAAAA CTTAGCAGAA CAATATGAAA TTTACAAAGA GAAGGAGTCG AAGTAAGCCA 2460 

TGGCACAAAT GACAATGGTT CAAGCGATTA ATGATGCGCT TAAAACTGAA CTTAAAAATG 2520 

ACCAAGATGT TTTAATTTTT GGTGAAGACG TTGGTGTTAA CGGCGGTGTT TTCCGTGTTA 2580

CTGAAGGACT ACAAAAAGAA TTTGGTGAAG ATAGAGTATT CGATACACCT TTAGCTGAAT 2640

CAGGTATTGG TGGTTTAGCG ATGGGTCTTG CAGTTGAAGG ATTCCGTCCG GTTATGGAAG 2700

TACAATTCTT AGGTTTCGTA TTCGAAGTAT TTGATGCGAT TGCTGGACAA ATTGCACGTA 2760

CTCGTTTCCG TTCAGGCGGT ACTAAAACTG CACCTGTAAC AATTCGTAGC CCATTTGGTG 2820

GTGGCGTACA CACACCAGAA TTACACGCAG ATAACTTAGA AGGTATTTTA GCTCAATCTC 2880
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CTATTAGAAG TAATGACCCA GTCGTATACT -TAGAGCATAT GAAATTGTAT CGTTCATTCC 3 000 

GTGAAGAAGT ACCTGAAGAA GAATATACAA TTGACATTGG TAAGGCTAAT GTGAAAAAAG 3060 

AAGGTAATGA CATTTCAATC ATCACATACG GTGCAATGGT TCAAGAATCA ATGAAAGCTG 3120 

CAGAAGAACT TGAAAAAGAT GGTTATTCTG TTGAAGTAAT TGACTTACGT ACTGTTCAAC 3180 

CAATCGATGT TGACACAATT GTAGCTTCAG TTGAAAAAAC TGGTCGTGCA GTTGTAGTTC 3240 

AAGAAGCACA ACGTCAAGCT GGTGTTGGTG CAGCAGTTGT AGCTGAATTA AGTGAACGTG 3300 

CAATCCTTTC ATTAGAAGCA CCTATTGGAA GAGTTGCAGC AGCAGATACA ATTTATCCAT 3360

TCACTCAAGC TGAAAATGTT TGGTTACCAA ACAAAAATGA CATCATCGAA AAAGCAAAAG 3420 

AAACTTTAGA ATTTTAATAC ATTTTAAAAG TTAACGAAGT TAGCGTATTT TAGTCTCATT 3480

GATTAAAATG AAATGTTTAA TTTACGAAAT CTTAGGAGGG CAAAAACGTG GCATTTGAAT 3540 

TTAGATTACC CGATATCGGG GAAGGTATCC ACGAAGGTGA AATTGTAAAA TGGTTTGTTA 3600 

AAGCTGGAGA TACTATTGAA GAAGACGATG TTTTAGCTGA GGTACAAAAC GATAAATCAG 3660 

TAGTAGAAAT CCCATCACCA GCATCTGGTA CTGTAGAAGA AGTTATGGTA GAAGAAGGTA 3720 

CAGTAGCTGT AGTTGGTGAC GTTATTGTTA AAATCGATGC ACCTGATGCA GAAGATATGC 3780

AATTTAAAGG TCATGATGAT GATTCATCAT CTAAAGAAGA ACCTGCGAAA GAGGAAGCGC 3840

CAgcAGaGCA AGCACCTGTA GCTACTCAAA CTGAAGAAGT AGATGAAAAC AGAACTGTTA 3900

AAGCAATGCC TTCAGTACGT AAATACGCAC GTGAAAAAGG TGTTAACATT AAAGCAGTTT 3960

CTGGATCTGG TAAAAATGGT CGTATTACAA AAGAAGATGT AGATGCATAC TTAAATGGTG 4020 

GTGCACCAAC AGCTTCAAAT GAATCAGCTG CTTCAGCTAC AAGTGAAGAA GTTGCTGAAA 4080 

CTCCTGCAGC ACCTGCAGCA GTAACATTAG AAGGCGACTT CCCAGAAACA ACTGAAAAAA 4140

TCCCTGCTAT GCGTAGAGCA ATTGCGAAAG CAATGGTTAA CTCTAAGCAT ACTGCACCTC 4200 

ATGTAACATT AATGGATGAA ATTGATGTTC AAGCATTATG GGATCACCGT AAGAAATTTA 4260 

AAGAAATCGC AGCTGAACAA GGTACTAAGT TAACATTCTT ACCTTATGTT GTTAAAGCAC 4320 

TTGTTTCTGC ATTGAAAAAA TACCCAGCAC TTAACACTTC ATTCAATGAA GAAGCTGGTG 4380 

AAATCGTTCA TAAACATTAC TGGAATATCG GTATTGCAGC AGACACTGAT AGAGGATTAT 4440 

TAGTACCTGT TGTTAAACAT GCTGATCGTA AGTCTATTTT CCAAATTTCA GATGAAATTA 4500

ATGAATTAGC TGTTAAAGCA CGTGATGGTA AATTAACAGC CGATGAAATG AAAGGTGCTA 4560

CATGCACAAT CAGTAATATC GGTTCAGCTG GTGGACAATG GTTCACTCCA GTTATCAATC 4620

ACCCAGAAGT AGCAATCTTA GGAATTGGCC GTATTGCTCA AAAACCTATC GTTAAAGATG 4680
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ATGGTGCAAC TGGCCAAAAT GCAATGAATC ACATTAAACG TTTATTAAAT AATCCAGAAT 4800 

TATTATTAAT GGAGGGGTAA AACATGGTAG TTGGAGATTT CCCAATTGAA ACAGATACTA 4860 

TAGTAATCGG AGCAGGTCCT GGTGGATACG TTGCAGCAAT TCGTGCAGCT CAATTAGGAC 4920 

AAAAAGTAAC AATCGTTGAG AAAGGTAATC TTGGTGGTGT TTGCTTAAAC GTAGGATGTA 4980

TTCCTTCAAA AGCATTACTA CATGCTTCTC ACCGTTTTGT TGAAGCACAA CATTCTGAAA 5040 

ACTTAGGTGT TATTGCTGAA AGTGTTTCTT TAAACTTCCA AAAAGTTCAA GAATTCAAAT 5100 

CATCAGTTGT TAATAAATTA ACTGGTGGTG TTGAAAGCTT ACTTAAAGGT AACAAAGTTA 5160

ACATCGTTAA AGGTGAAGCA TATTTCGTAG ATAACAATAG CTTACGTGTT ATGGACGAAA 5220 

AGAGCGCACA AACATACAAC TTTAAAAATG CAATCATTGC AACAGGTTCA AGACCAATTG 5280 

AAATTCCTAA TTTCAAATTC GGTAAACGTG TTATCGACTC AACAGGTGCT TTAAACTTAC 5340 

AAGAAGTACC aGGTAAATTA GTTGTAGTTG GTGGAGGATA CATTGGATCA GAATTAGGTA 5400

CAGCATTTGC TAACTTTGGT TCAGAAGTAA CCATCCTTGA AGGTGCTAAA GATATCTTAG 5460

GTGGCTTCGA AAAACAAATG ACACAACCTG TTAAAAAAGG TATGAAAGAA AAAGGTGTTG 5520

AAATCGTTAC TGAAGCTATG GCTAAATCAG CTGAAGAAAC AGATAACGGA GTTAAAGTTA 5580

CTTATGAAGC TAAAGGCGAA GAGAAAACAA TCGAAGCTGA TTATGTATTA GTAACTGTAG 5640 

GTCGTCGTCC AAACACAGAC GAATTAGGCC TAGAAGAATT AGGTGTTAAA TTCGCTGACC 5700 

GTGGATTATT AGAAGTTGAT AAACAAAGCC GTACGTCTAT CAGCAATATC TATGCAATTG 5760 

GTGATATCGT TCCAGGTTTA CCACTTGCTC ACAAAGCTAG CTATGAAGCT AAAGTTGCTG 5820 

CTGAAGCAAT TGATGGTCAA GCTGCTGAAG TTGATTACAT TGGTATGCCA GCAGTATGCT 5880 

TTACTGAACC AGAATTAGCT ACAGTTGGTT ATTCAGAAGC GCAAGCTAAA GAAGAAGGTT 5940

TAGCAATTAA AGCTTCTAAA TTCCCATATG CAGCAAATGG TCGTGCATTA TCATTAGATG 6000

ATACTAACGG ATTTGTTAAA CTTATTACAC TTAAAGAAGA TGATACTTTA ATCGGTGCTC 6060

AAGTAGTTGG TACTGGTGCA TCAGATATTA TCTCTGAATT AGGTTTAGCA ATTGAAGCTG 6120 

GTATGAATGC TGAAGATATC GCATTAACAA TCCATGCACA TCCAACATTA GGTGAGATGA 6180 

CTATGGAAGC AGCAGAAAAA GCTATCGGAT ACCCAATCCA TACAATGTAA TAACTGATTA 6240

TCTATAAAGA TTCAGTCATT AAAAGCTGTA GCATATGCTA CGGCTTTTTT GTTTTAGGTA 6300

AAGTAATGTA AGGAAATTGA TTTGAGATAT CGTTAACATG TGACATGCAT GTTATACTAG 6360

CGATGCTAAT AAAAGAATTG AAATGGAGGG TTCAACAATG GAATATGAGT ATCCAATTGA 6420

TTTAGACTGG AGTAATGAAG AGATGATTTC AGTGATAAAT TTCTTTAATC ATGTAGAGAA 6480
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AATTGTGCCT GCTAAAGCAG AGGAAAAACA AATTTTTAAT ACTTTCGAAA AAAGTAGTGG 6600 

CTATAATAGT TACAAAGCAG TTCAAGATGT AAAAACTCAC TCTGAAGAAC AAAGAGTAAC 6660 

AGCTAAAnAA TAATTCGTTC GAAATTAACA CAATTTAATA GGAATTTTTC TTTAAAACTA 6720

TTGCTAATAA AGCTATATTT TGATACCTTT ATCAAGTGTT AAACAAAATG TTTGATAAAA 6780 

GTAAACTTAA TATAGCTTTT TTAGGTGGAA AAATAAATGA ACATAGGTAA TAAAATTAAA 6840 

AATCTTAGAA GAATTAAAAA TTTAACGCAA GAAGAACTTG CTGAACGTAC AGACTTATCG 6900 

AAAGGCTACA TTTCACAAAT AGAAAGTGAA CATGCCTCAC CAAGTATGGA AACTTTCTTA 6960 

AATATTATAG AGGTGTTAGG AACGACGCCA AGTGAATTTT TTAAAGACAG TGAAAATGAA 7020 

AAAGTATTAT ACAAGAAGGA AGAACAAGTT ATTTATGATG AGTATGATGA AGGTTATATA 7080 

TTAAATTGGT TAGTTTCAAA GTCAAATGAA TATGATATGG AGCCATTAAT ATTAACTTTA 7140 

AAGCCTGGAG CATCATATAA AAATTTTAAT CCATCAGAGT CTGATACGTT TATTTATTGT 7200

ATGTCAGGTC AGATAACACT TAATTTAGGC AAAGAGATAT ATCAAGCACA AGAAGAAGAC 7260 

GTTTTGTATT TTAAAGCACG AGATAATCAT CGTTTGTCAA ACGAATCAAA CAATGAAACA 7320

CGAATACTTA TTGTAGCGAC AGCTTCATAT TTATAGGGGG GATCTTATTT GGAACCGTTA 7380

TTATCATTAA AATCAGTTAG TAAAAGCTAT GATGATCTTA ATATCTTAGA TGACATAGAT 7440

ATTGATATTG AATCAGGATA CTTTTATACA TTATTAGGTC CTTCAGGTTG TGGTAAAACA 7500

ACAATTTTAA AATTAATTGC AGGGTTTGAA TATCCTGACA GTGGTGAAGT GATTTATCAA 7560

AACAAACCAA TTGGTAATTT ACCACCAAAT AAACGTAAAG TGAATACAGT CTTTCAAGAT 7620 

TATGCATTAT TTCCACACTT AAACGTCTAT GATAATATCG CTTTTGGTTT GAAATTAAAA 7680 

AAATTATCAA AAACCGAAAT TGATCAAAAA GTAACTGAGG CATTAAAATT AGTAAAACTT 7740

TCAGGTTATG AAAAAAGAAA TATTAATGAA ATGAGTGGCG GACAAAAGCA ACGTGTTGCA 7800 

ATTGCACGTG CTATCGTAAA TGAACCAGAA ATATTATTGT TAGATGAATC TTTATCCGCA 7860

TTAGATTTGA AATTGCGTAC TGAAATGCAA TATGAATTAC GAGAATTGCa ATCTAGATTA 7920

GGtATTACAT TTATATTTGT aACACATGAT CCA 7953
(2) INFORMATION FOR SEQ ID NO: 153:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2347 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
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GGCGTGATCA TACGACCGTC ATTCATGCTC ATGAAAAAAT ATCTAAAGAT TTAAAAGAAG 60 

ATCCTATTTT TAAACAAGAA GTAGAGAATC TTGAAAAAGA AATAAGAAAT GTATAAGTAG 120

GAAACTTTGG GAAATGTAAT CTGTTATATA ACAGCACTAA TGATnACAAT CATTTTTTAC 180 

ATTTCTATAT GCTAATGTGG CAAGATGAGC AAAACTCATT TTGTGGATaA TGTTTaAAAG 240 

TCATACACAC CATACACAAG TTATCAACAT GTGTATAAyT cGcCAAATCT ATGTTTTTAA 300

GACTTATCCA CCAATCCACA GCACCTACTA CTATTACTAA GAACTTAAAA CCTATATAAT 360

TATATATAAA CGACTGGAAG GAGTTTTAAT TAATGATGGA ATTcACTATT AAAAGAGATT 420 

ATTTTATTAC ACAATTaAAT GACACATTAA AAGCTATTTC ACCAAGaACA ACATTACCTA 480 

TATTAACTGG TATCAAAATC GATGCGAAAG AACATGAAGT TATATTaACT GGTTCAGACT 540 

CTGAAATTTC AATAGAAATC ACTATTCCTA AAACTGTAGA TGGCGAAGAT ATTGTCAATA 600 

TTTCAGAAAC AGGCTCAGTA GTACTTCCTG GACGATTCTT TGTTGATATT ATAAAAAAAT 660

TACCTGGTAA AGATGTTAAA TTATCTACAA ATGAACAATT CCAGACATTA ATTACATCAG 720

GTCATTCTGA ATTTAATTTA AGTGGCTTAG ATCCAGATCA ATATCCTTTA TTACCTCAAG 780

TTTCTAGAGA TGACGCAATT CAATTGTCGG TAAAAGTGCT TAAAAACGTG ATTGCACAAA 840

CAAATTTTGC AGTGTCCAcC TCAGAAACAC GCCCAGTACT AACTGGTGTG AACTGGCTTA 900 

TACAAGAAAA TGAATTAATA TGCACAGCGA CTGACTCACA CCGCTTGGCT GTAAGAAAGT 960 

TGCAGTTAGA AGATGTTTCT GAAAACAAAA ATGTCATCAT TCCAGGTAAG GCTTTAGCTG 1020 

AATTAAATAA AATTATGTCT GACAATGAAG AAGACATTGA TATCTTCTTT GCTTCAAACC 1080 

AAGTTTTATT TAAAGTTGGA AATGTGAACT TTATTTCTCG ATTATTAGAA GGACATTATC 1140

CTGATACAAC ACGTTTATTC CCTGAAAACT ATGAAATTAA ATTAAGTATA GACAATGGGG 1200 

AGTTTTATCA TGCGATTGAT CGTGCCTCTT TATTAGCGCG TGAAGGTGGT AATAACGTTA 1260 

TTAAATTAAG TACAGGTGAT GACGTTGTTG AATTGTCTTC TACATCACCA GAAATTGGTA 1320

CTGTAAAAGA AGAAGTTGAT GCAAACGATG TTGAAGGTGG TAGCCTGAAA ATTTCATTCA 1380

ACTCTAAATA TATGATGGAT GCTTTAAAAG CAATCGATAA TGATGAGGTT GAAGTTGAAT 1440

TCTTCGGTAC AATGAAACCA TTTATTCTAA AACCAAAAGG TGACGACTCG GTAACGCAAT 1500

TAATTTTACC AATCAGAACT TACTAAAAAT AAATATAAAT AAAGGATGAC GTGATTAATT 1560

AAAACGTCAT CCTTTATTTT TTGGCAAAAA TAATTCTAGG TGCGTATGTA AAATAAATTT 1620

GGCAGCATTT TAAACAGCAA ATAAAAGACG CCAATTAAAT TTATGACAAA TGTATCCAAA 1680

ATTTAATAAG TGTGCTTATA TGCCCTTTAA ATTTAAAATT TTAATAGTCA ATAACAAGTT 1740
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AAAAATAAGA ATTAATTATT TATATGTAAA CGGTTTCTAC CTCTATTTTA AATGAAATTT I860 

GTGACAAAAA AAGGTATAAT ATATTAATGA CATACAAAGA AATGGAGTGA TTATTTTGGT 1920 

TCAAGAAGTT GTAGTAGAAG GAGACATTAA TTTAGGTCAA TTTCTAAAAA CAGAAGGGAT 1980 

TATTGAATCT GGTGGTCAAG CAAAATGGTT CTTGCAAGAC GTTGAAGTAT TAATTAATGG 2040

AGTGCGTGAA ACACGTCGCG GTAAAAAGTT AGAACATCAA GATCGTATAG ATATCCCAGA 2100 

ATTACCTGAA GATGCTGGTT CTTTCTTAAT CATTCATCAA GGTGAACAAT GAAGTTAAAT 2160 

ACACTCCAAT TAGAAAATTA TCGTAACTAT GATGAGGTTA CGTTGAAATG TCATCCTGAC 2220 

GTGAATATCC TCATTGGAGA AAATGCACAA GGGAAAGACA AATTTACTTG GAATCAATTT 2280 

ATACCTTAGC TTTAGCAAAA AGTCATAGAA CGAGTAATGG ATAAGGGACT CCATACCGTT 2340 

TTAATGC 2347 

(2) INFORMATION FOR SEQ ID NO: 154:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13542 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 154: 

ACAAGACGTri TCTATAACTT ATCTGAAATC GCTCGTCAAG ATAAAGATTA TGCAACTATC 60

TCATTCTTAA ACTGGTTCTT AGATGAACAA GTCGAAGAAG AATCAATGTT TGAAACTCAC 120 

ATCAATTATT TAACTCGTAT CGGCGATGAC AGCAATGCAT TATATCTTTA CGAAAAAGAA 180 

CTTGGCGCTC GTACATTCGA CGAAGAATAA TTAAACATCA CTACAATAGA CAGATAAATA 240

TCATACGACA TGATAGGCAT TTGGGTCACT TACAATAACC CAATGTCTAT ATTATTTTGC 300

TTTACGGAGA TCACTAGATT CATTTTCTGA ATCATTGATC TGCGTTTTTT CATTTTCAAG 360

GCTAATTATT GTATTTTTAG TCATTTATTT TTTAAACTAC TAATGTTAAT AACTCTAAAT 420

TTGATGTTGA ATTAATTTGA CGATTTTAAA GCATATCATC ATTTACTTTT TAATCAGAGT 480

TACATCCAAA TGATAGATTT CACGTTATAC CTTCACGTAT AATATTATGT ATCGTTTGTA 540

AGCAAATGAC TAAAAGTCTA TTAATATATA CATTTAATTA ATTGAAAGGA TTGACTACAT 600

GATACAAGAT GCGTTTGTTG CACTTGATTT TGAAACAGCA AATGGTAAAC GTACAAGTAT 660

TTGTTCTGTC GGAATGGTTA AAGTCATTGA TAGTCAAATA ACAGAAACAT TTCATACTCT 720

TGTGAATCCG CAAGACTATT TTTCACAACA AAATATTAAA ATTCATGGCA TACAACCAGA 780 
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